85e43f4 | aroot | 07 September 2022, 19:10:02 UTC | fix asm runner | 07 September 2022, 19:10:02 UTC |
cc4195c | Alexander Root | 07 September 2022, 18:24:32 UTC | add test for devbox | 07 September 2022, 18:24:32 UTC |
6b2f147 | Alexander Root | 07 September 2022, 18:19:35 UTC | attempt at using bounds inference in instruction selection | 07 September 2022, 18:19:35 UTC |
a57e022 | Alexander Root | 01 September 2022, 01:30:28 UTC | Merge branch 'main' of github.com:halide/Halide into rootjalex/x86-optimize | 01 September 2022, 01:30:28 UTC |
95e37ee | Steven Johnson | 31 August 2022, 21:44:17 UTC | Improve error-handling in Python Extensions (#6986) * Improve error-handling in Python Extensions Currently, Python Extensions don't make any effort to override `halide_error`, so the default (which aborts) is generally used... this is very unfriendly. This modifies the standard Python Extension glue code to hook halide_error, saving the text in a thread local, and then throwing a Python exception after the extension's AOT call is finished (if an error occurred, of course). Also does a drive-by default hooking of `halide_print` to ensure that it goes to whatever Python thinks that `stdout` is. (Note that it would be really nice if we could use closures of some sort for halide_error, halide_print, etc so that we could save context in the actual Python module, rather than in a thread-local global var, but this currently isn't possible without nontrivial refactoring in the Halide runtime.) * Make Windows happy * Remove dangling code bits * Allow defeating of error-handler via HALIDE_PYTHON_EXTENSION_OMIT_ERROR_AND_PRINT_HANDLERS | 31 August 2022, 21:44:17 UTC |
e531e24 | Alex Reinking | 31 August 2022, 16:55:51 UTC | Fix markdown links (#6988) | 31 August 2022, 16:55:51 UTC |
386e1ee | Steven Johnson | 30 August 2022, 20:19:58 UTC | Add `add_halide_runtime` rule (#6985) Fixes #6981 | 30 August 2022, 20:19:58 UTC |
e7c1c86 | Steven Johnson | 29 August 2022, 23:09:20 UTC | Add test for _Halide_target_export_single_symbol (#6983) Add test for _Halide_target_export_single_symbol | 29 August 2022, 23:09:20 UTC |
c24b406 | Steven Johnson | 29 August 2022, 20:48:23 UTC | Add add_halide_python_extension_library() rule (#6979) * Add add_halide_python_extension_library() rule This adds a rule to create a single Python extension library from one (or more) halide_library rules. This allows you to package multiple Halide filters into a single Python module, which is nice because (1) being able to organize is good, and (2) all the filters in a single Python extension module share the same Halide runtime, including (e.g.) thread pools and method overrides. (It also removes the just-recently-added PYTHON_EXTENSION_LIBRARY option from the add_halide_library rule, as this new rule is better and more flexible in pretty much every way.) This modifies the content of our `python_extension` output in such a way that existing uses should be completely unaffected, but defining the right preprocessor macros allows us to split the function wrappers up from the method-definition declaration, so we don't have to generate any new code artifacts to make this work. Partially addresses #6956. * Omits -D in target_compile_definitions * be explicit about setting to empty * Add quotes * Add comments re BUILD_INTERFACE * Add MODULE_NAME comment * Remove "defined in HalideGeneratorHelpers.cmake" * Add comment re add_halide_runtime() * osx, macos, darwin, oh m * blankity blank blank * Use OBJECT library instead * Add comment about X-macros * Update HalideGeneratorHelpers.cmake | 29 August 2022, 20:48:23 UTC |
5a09dda | Alex Reinking | 26 August 2022, 00:42:26 UTC | Fix XCode by wrapping weights in an OBJECT library (#6977) The XCode "new build system" doesn't like generated source files to be associated with more than one target. Going through an OBJECT library like this fixes that problem, but also saves us a compilation, so it's a good thing to do anyway. Fixes #6976 | 26 August 2022, 00:42:26 UTC |
50018c4 | Zalman Stern | 25 August 2022, 23:01:39 UTC | Small refactor to remove confusion between CodeGen_LLVM and CodeGen_Internal. (#6973) * Small refactor to remove confusion between CodeGen_LLVM and CodeGen_Internal. | 25 August 2022, 23:01:39 UTC |
f6eb2bf | Alexander Root | 24 August 2022, 17:02:38 UTC | update SpecificExpr comment + remove dangling TODO comments | 24 August 2022, 17:02:38 UTC |
1eb0e94 | Alexander Root | 24 August 2022, 16:49:48 UTC | clang format | 24 August 2022, 16:49:48 UTC |
3b0dc43 | Alexander Root | 24 August 2022, 16:35:35 UTC | missing && | 24 August 2022, 16:35:35 UTC |
5258627 | Alexander Root | 24 August 2022, 16:23:10 UTC | Merge branch 'main' of github.com:halide/Halide into rootjalex/x86-optimize | 24 August 2022, 16:23:10 UTC |
4877b9f | Alexander Root | 24 August 2022, 16:21:24 UTC | Lower saturating_cast in bounds inference (#6970) * lower saturating_cast in bounds inference * openGL fix to saturating_cast | 24 August 2022, 16:21:24 UTC |
292d8e5 | Alexander Root | 24 August 2022, 16:10:53 UTC | clang format | 24 August 2022, 16:10:53 UTC |
9a5327c | Alexander Root | 24 August 2022, 16:06:47 UTC | add better type checking in IRMatch for SpecificExpr cases | 24 August 2022, 16:06:47 UTC |
068f1e2 | Alex Reinking | 24 August 2022, 00:35:10 UTC | Don't cache Halide_ASAN_ENABLED (#6969) check_cxx_symbol_exists saves its output in the cache and does not run if its destination variable is defined. This is OK when used to test something that necessitates a totally fresh configuration, like any property of the target architecture, which would require changing the toolchain file. However, the result here can change if someone just modifies CMAKE_CXX_FLAGS, so it can get out of sync in some cases. | 24 August 2022, 00:35:10 UTC |
eacadb6 | Alex Reinking | 24 August 2022, 00:35:00 UTC | Use CMake target to handle vendored SPIRV headers (#6968) | 24 August 2022, 00:35:00 UTC |
2f0957b | Alex Reinking | 23 August 2022, 19:38:51 UTC | CMake packaging fixes (#6966) * Add Halide_ASAN_ENABLED to package. * Fix handling of optional components. Make PNG/JPEG optional. * Make it easier to find HalideHelpers Before this change, users would either need to set Halide_ROOT to a Halide installation path or add said path to CMAKE_PREFIX_PATH. If they tried to use a different mechanism, like setting Halide_DIR or directly annotating their find_package call with HINTS or PATHS, then it would fail to find HalideHelpers.cmake. Adding this hint inside HalideConfig.cmake makes the package more robust, while still respecting the more powerful Halide_ROOT and CMAKE_PREFIX_PATH variables. * Delete undocumented variables in HalideConfig Some of our package's internal variables and macros leak out into user builds. We don't want users to use any of these. We might hit Hyrum's Law here, but I hope not. Users of these variables and macros should seek other means. | 23 August 2022, 19:38:51 UTC |
ca6319b | Alex Reinking | 23 August 2022, 04:07:09 UTC | Some minor top-level CMakeLists.txt reorganization (#6957) * Disables the usage warning when CMAKE_BUILD_TYPE is defined, but explicitly empty. * Overrides C++ standard variables using the cache (CMake 3.21+) * Allows including projects to build our tests, etc. but disables by default via PROJECT_IS_TOP_LEVEL (CMake 3.21+) * Removes misleading distrib target. * Removes deprecated clang-format target (use ./run-clang-format.sh instead) | 23 August 2022, 04:07:09 UTC |
dc4d1f7 | Alexander Root | 23 August 2022, 03:59:14 UTC | i8 -> u8 bugfix | 23 August 2022, 03:59:14 UTC |
e2045bf | Alexander Root | 23 August 2022, 02:46:59 UTC | place Expr constants on the stack | 23 August 2022, 02:46:59 UTC |
37f7514 | Steven Johnson | 23 August 2022, 00:40:17 UTC | Python: don't crash for repr(Expr()) (#6962) | 23 August 2022, 00:40:17 UTC |
671b26d | Steven Johnson | 23 August 2022, 00:39:28 UTC | Enable deprecations warnings (#6555) * Enable deprecations warnings We currently disable deprecation warnings inside Halide. This re-enables them there, and also inside add_halide_generator(). | 23 August 2022, 00:39:28 UTC |
f7a30e0 | Alex Reinking | 22 August 2022, 18:35:26 UTC | Fix RPATH for Python wheels on macOS (#6958) | 22 August 2022, 18:35:26 UTC |
bbfefd2 | Alexander Root | 22 August 2022, 16:28:54 UTC | Merge branch 'main' of github.com:halide/Halide into rootjalex/x86-optimize | 22 August 2022, 16:28:54 UTC |
fd3bec3 | Alexander Root | 22 August 2022, 16:08:34 UTC | [HVX] Fix state_var issue (#6894) * fix HVX state_var issue * abort if host is nullptr | 22 August 2022, 16:08:34 UTC |
4bcd6fa | Steven Johnson | 19 August 2022, 21:17:21 UTC | Remove add_python_stub_extension(), adding the functionality to add_halide_generator() instead (#6952) * Remove add_python_aot_extension() rule in CMake Move it into `add_halide_library` instead, as another output option. (add_python_stub_extension will likely be moved as well, in a subsequent PR) * Update README_cmake.md * Skip Python tests when compiling for WASM * PyStubs * Update README_cmake.md * Update Generator.h * Update CMakeLists.txt * Revert "Update CMakeLists.txt" This reverts commit ed5bb00283f0e4fbdbea74adf497d4ff93b0c8d1. * fixes * fixes * Update CMakeLists.txt * fixes * fixes * fixes * Remove LIBRARY DESTINATION * Update CMakeLists.txt * fixup packaging Co-authored-by: Alex Reinking <reinking@google.com> | 19 August 2022, 21:17:21 UTC |
a1cd71c | Alex Reinking | 19 August 2022, 03:55:04 UTC | Build fixes for manylinux2014 (#6953) | 19 August 2022, 03:55:04 UTC |
1068403 | Steven Johnson | 17 August 2022, 23:20:43 UTC | Remove add_python_aot_extension() rule in CMake (#6949) * Remove add_python_aot_extension() rule in CMake Move it into `add_halide_library` instead, as another output option. (add_python_stub_extension will likely be moved as well, in a subsequent PR) * Update README_cmake.md * Skip Python tests when compiling for WASM | 17 August 2022, 23:20:43 UTC |
dd5fe8d | Alex Reinking | 17 August 2022, 22:17:40 UTC | Two quick build fixes (#6950) * ASLog is linked to autoscheduler MODULES; needs PIC * Out-of-source Python bindings just need libHalide, not imageio * Fixes to setup.py | 17 August 2022, 22:17:40 UTC |
807d988 | Alexander Root | 17 August 2022, 19:33:38 UTC | Handle saturating_cast in compute_expr_cost() (#6947) | 17 August 2022, 19:33:38 UTC |
5100ad6 | Steven Johnson | 17 August 2022, 01:06:52 UTC | Don't throw an exception from generate_filter_main (#6946) | 17 August 2022, 01:06:52 UTC |
4f5c53c | Steven Johnson | 16 August 2022, 22:15:53 UTC | Add/update Python Readme (#6939) * Add/update Python Readme This moves the Python README to the toplevel and reworks it considerably, adding details and updating various bits. Note that the Python documentation here is still incomplete; this is intended as a prelude to adding documentation for Python Generators in a future PR. | 16 August 2022, 22:15:53 UTC |
63d563f | Steven Johnson | 16 August 2022, 21:38:42 UTC | Export HalidePythonExtensionHelpers.cmake for installs (#6941) * Export HalidePythonExtensionHelpers.cmake for installs * oops * fixes * Fix broken code in target_export_script() * oops #2 * Add WITH_SOABI to stubs as well as AOT * More fixes * Update CMakePresets.json * Update CMakePresets.json | 16 August 2022, 21:38:42 UTC |
52b91a4 | Andrew Adams | 14 August 2022, 17:24:30 UTC | Add minimal useful implementation of extracting and concatenating bits (#6928) * Minimal approach to making Deinterleave correct for Reinterpret * Add minimal useful implementation of extracting and concatenating bits * clang-tidy * More clang-tidy fixes * Add missing error message * Add low-bit-depth noise test * Add test to cmake build * Fix power-of-two check * Remove dead object * Add little-endian comment to reinterpret IR node * Simplify concat_bits of single arg * Add missing second arg * Fix concat_bits call Co-authored-by: Andrew Adams <anadams@adobe.com> | 14 August 2022, 17:24:30 UTC |
f60a8fb | Steven Johnson | 12 August 2022, 20:45:37 UTC | Fix badly-merged CMakePresets.json file (#6936) | 12 August 2022, 20:45:37 UTC |
5e8f97b | Steven Johnson | 11 August 2022, 22:57:54 UTC | Add ASAN support to CMake via toolchain file (#6920) Add ASAN support Co-authored-by: Alex Reinking <reinking@google.com> | 11 August 2022, 22:57:54 UTC |
b734957 | Steven Johnson | 11 August 2022, 19:35:28 UTC | Add build & test presets for release and debug CMake builds (#6934) Also, drive-by rename of 'default' to 'base' to better imply the inheritance | 11 August 2022, 19:35:28 UTC |
4cdc2a1 | Steven Johnson | 11 August 2022, 19:30:57 UTC | Tutorial 10 needs to be skipped for Python when targeting Wasm (just as non-Python does) (#6932) * Tutorial 10 needs to be skipped for Python when targeting Wasm (just as non-Python does) * fixes * Update CMakeLists.txt | 11 August 2022, 19:30:57 UTC |
43e6a26 | Steven Johnson | 10 August 2022, 22:05:11 UTC | Rework internal PYTHONPATH maintenance (#6922) * Rework PYTHONPATH * Move pure-Python file copying logic to build time. * Use TARGET_RUNTIME_DLLS to copy all DLLs instead of just Halide. * Ensure that the last path component for Halide_Python is always `halide` * Simplify __init__.py now that it's copied to build tree * Add helper to de-duplicate PYTHONPATH test logic Fixes #6870 Co-authored-by: Alex Reinking <alex.reinking@gmail.com> Co-authored-by: Alex Reinking <reinking@google.com> | 10 August 2022, 22:05:11 UTC |
92de4a1 | Steven Johnson | 10 August 2022, 17:30:46 UTC | Halide::Error should not extend std::runtime_error (#6927) * Halide::Error should not extend std::runtime_error Unfortunately, the std error/exception classes aren't marked for DLLEXPORT under MSVC; we need our Error classes to be DLLEXPORT for libHalide (and python bindings). The current situation basically causes MSVC to generate another version of `std::runtime_error` marked for DLLEXPORT, which can lead to ODR violations, which are bad. AFAICT we don't really rely on this inheritance anywhere, so this just eliminates the inheritance entirely. (Note that I can't point to a specific malfunction resulting from this, but casual googling based on the many warnings MSVC emits about the current situation has me convinced that it needs addressing.) * noexcept | 10 August 2022, 17:30:46 UTC |
22d17e7 | Alexander Root | 08 August 2022, 21:55:30 UTC | fix namespace issue | 08 August 2022, 21:55:30 UTC |
a98f268 | Alexander Root | 08 August 2022, 21:34:38 UTC | update x86 saturating_cast rules using intrinsic | 08 August 2022, 21:34:38 UTC |
258b72c | Alexander Root | 08 August 2022, 21:26:33 UTC | merge conflict | 08 August 2022, 21:26:33 UTC |
1bf1599 | Alexander Root | 08 August 2022, 21:24:16 UTC | Make saturating_cast an intrinsic (#6900) * Make saturating_cast an intrinsic * handle saturating_cast in Bounds.cpp + add bounds tests * update saturating_cast CodeGen * with_lanes should work on intrinsics as well * lift to saturating_cast in FindIntrinsics * update intrinsics test for u16_sat * better sat_cast(widen(expr)) handling in find_intrinsics * simplify bounds of saturating_cast + update is_monotonic | 08 August 2022, 21:24:16 UTC |
19b2c5e | Alexander Root | 08 August 2022, 20:57:30 UTC | rm stray 'protected' | 08 August 2022, 20:57:30 UTC |
8794fac | Alex Reinking | 08 August 2022, 20:44:32 UTC | Make use of CMake 3.22 features (#6919) * Remove AddCudaToTarget.cmake * Remove MakeShellPath.cmake * Use CheckLinkerFlag in TargetExportScript * Use DEPFILE for all generators * Use REQUIRED with find_program, where applicable * Use REQUIRED with find_library, where applicable * Use CMake 3.21 cache behavior in HalideTargetHelpers.cmake * Replace uses of get_filename_component with cmake_path * Rework BLAS detection in linear_algebra app * Drive-by: fix autotune_loop.sh install rule. * Fix CBLAS header in linear_algebra test_halide_blas | 08 August 2022, 20:44:32 UTC |
9da91da | Alexander Root | 08 August 2022, 19:25:08 UTC | Merge branch 'rootjalex/x86-optimize' of github.com:halide/Halide into rootjalex/x86-optimize | 08 August 2022, 19:25:08 UTC |
7cc3b64 | Alexander Root | 08 August 2022, 19:24:33 UTC | Merge branch 'main' of github.com:halide/Halide into rootjalex/x86-optimize | 08 August 2022, 19:24:33 UTC |
6e67ddf | Alexander Root | 08 August 2022, 19:23:52 UTC | implement pattern matching for SapphireRapids | 08 August 2022, 19:23:52 UTC |
9ca7560 | Steven Johnson | 05 August 2022, 02:20:29 UTC | Fix wrong install path for *.py files (#6921) * Fix wrong install path for *.py files We were looking in a nonexistent dir, so we never copied `__init__.py` as we should have. * Update CMakeLists.txt | 05 August 2022, 02:20:29 UTC |
256c4d9 | Andrew Adams | 04 August 2022, 20:21:01 UTC | Fix bug when realize condition depends on tuple call (#6915) If the realization is tuple-valued, and the condition on the realization uses a tuple call (index != 0), then the condition wasn't getting resolved during the split_tuples pass. The cause was a missing mutate call. | 04 August 2022, 20:21:01 UTC |
ffa2c36 | Steven Johnson | 04 August 2022, 20:10:37 UTC | Fix two warnings found with clang 16 (#6918) - variable 'count' set but not used - warning: use of bitwise '|' with boolean operands | 04 August 2022, 20:10:37 UTC |
3a04fc0 | Alex Reinking | 04 August 2022, 14:00:45 UTC | Remove unused GHA and packaging workflows. (#6917) | 04 August 2022, 14:00:45 UTC |
cc44ee5 | Steven Johnson | 04 August 2022, 01:40:22 UTC | Upgrade CMake minimum version to 3.22 (#6916) Fixes #6910. | 04 August 2022, 01:40:22 UTC |
857b045 | Steven Johnson | 03 August 2022, 22:24:43 UTC | LICENSE.txt: add BLAS license. (#6914) | 03 August 2022, 22:24:43 UTC |
a893d5e | Steven Johnson | 03 August 2022, 22:23:44 UTC | LICENSE.txt: add spirv license (#6913) | 03 August 2022, 22:23:44 UTC |
0072946 | Steven Johnson | 03 August 2022, 22:23:28 UTC | LICENSE.txt: Include full text of Apache 2.0 license (not just the 'header' version) (#6912) | 03 August 2022, 22:23:28 UTC |
88e7229 | Alex Reinking | 02 August 2022, 20:55:53 UTC | Start developing pip package (#6886) Co-authored-by: Lukas Trümper <lukas.truemper@outlook.de> | 02 August 2022, 20:55:53 UTC |
dd391e6 | Steven Johnson | 02 August 2022, 20:32:46 UTC | Fix broken Makefile rules for autoschedulers on OSX (#6906) * Fix broken Makefile rules for autoschedulers on OSX A few issues here: - Make was building the plugins as .dylib on OSX, but they should have been .so to match Linux (and just on general principles) - On OSX, explicitly linking libHalide.dylib into a plugin means that it will load its own copy of libHalide, which is bad, because it means the plugin doesn't share the same set of globals. We need to omit that explicit dependency and allow it to just find the exported symbols at load time. - Add a test to verify the fix; run it everywhere even though it should only have been failing for Make-build OSX builds. Finally, let me add that we really need to set a sunset date for supporting Make in Halide. The Makefiles aren't really maintained properly anymore, and when something subtle goes wrong, it takes an unreasonable amount of time to debug for something that is no longer our canonical build tool. * Use order-only prerequisites * Remove new load_plugin.cpp test Not worth the complexity for the extra test coverage. | 02 August 2022, 20:32:46 UTC |
cd0fe8a | Alexander Root | 02 August 2022, 17:31:19 UTC | clang format | 02 August 2022, 17:31:19 UTC |
545fbe8 | Alexander Root | 02 August 2022, 17:28:13 UTC | lower mod in InstructionSelector too | 02 August 2022, 17:28:13 UTC |
2239119 | Andrew Adams | 02 August 2022, 15:24:39 UTC | Fix autoscheduling trivial lut wrappers (#6905) * Fix autoscheduling trivial lut wrappers Fixes #6899 * trigger buildbots Co-authored-by: Steven Johnson <srj@google.com> | 02 August 2022, 15:24:39 UTC |
8871404 | Frederik | 01 August 2022, 23:23:28 UTC | Allow AMX instructions with K dimension larger than 4 bytes (#6582) * recognize the patterns used for the RHS matrix * make 1d tile matcher more robust * put getting rhs tile's index into a separate func * expand the tests used in correctness check * add exclamation mark * remove unused vars * run format and tidy * check for null before using IR in the next step * check if the broadcast was found * llvm below 13 is no longer supported * replace single pattern with commutative permutations * check if the stride is an `IntImm`, otherwise reject pattern * apply clang-format-13 * rename wild_i32 -> v2 * check if v1 could be the stride value * add more detail to a receiving a bad type * added short explanation of the right-hand matrix layout * added explanation for where the 4 comes from * provide further documentation as to the layout of AMX * add comments for expected patterns to get_3d_rhs_tile_index * Document the matched pattern Co-authored-by: Steven Johnson <srj@google.com> | 01 August 2022, 23:23:28 UTC |
703a738 | Steven Johnson | 01 August 2022, 20:08:13 UTC | Upgrade clang-format and clang-tidy to v14 (v2) (#6902) | 01 August 2022, 20:08:13 UTC |
e35654b | Alexander Root | 01 August 2022, 17:35:48 UTC | Don't try to fold saturating_sub of VectorReduce (#6896) don't fold saturating_sub of VectorReduce | 01 August 2022, 17:35:48 UTC |
e03b0e0 | Roman Lebedev | 01 August 2022, 16:19:30 UTC | [Codegen_LLVM] Annotate LLVM IR functions with `nounwind`/`mustprogress` attributes (#6897) My reasoning is as follows, please correct me if I'm wrong: 1. Halide-generated code never throws exceptions 2. Halide-generated code always `call`s (as opposed to `invoke`s) the functions, there is no exception-safety RAII 3. Halide loops are meant to have a finite number of iterations, they aren't meant to be endless and side-effect free 4. Halide (IR) assertions *might* abort. 5. Likewise, external callees *might* abort. (???) Therefore, when not in the presence of external calls, it is obvious that (1) no exception will be unwound out of the Halide-generated function, (2) none of the loops will end up being endless with no observable side-effects. ... which is the semantics that is being stated by the LLVM IR function attributes `nounwind`+`mustprogress`. I'm less clear as to what the prerequisites on the behavior of the external callees are, but I do believe that they must also at least not unwind. I guess they are also at least required to either return or abort eventually. | 01 August 2022, 16:19:30 UTC |
0739045 | Roman Lebedev | 01 August 2022, 16:18:55 UTC | [Simplify] Drop no-op single-input identity shuffles (#6901) | 01 August 2022, 16:18:55 UTC |
6cc77b2 | Steven Johnson | 29 July 2022, 22:47:33 UTC | Add `auto_schedule` label to Adams2019 and Li2018 tests in CMake (#6898) * Add `auto_schedule` label to Adams2019 and Li2018 tests in CMake These were ~never getting tested on the buildbots (and still aren't, I need to update it to run `auto_schedule` tests) but conceptually these tests should be in the same group as for Mullapudi. Also, drive-by fix to broken test_apps_autoscheduler injected in https://github.com/halide/Halide/pull/6861. * trigger buildbots | 29 July 2022, 22:47:33 UTC |
9c25902 | Derek Gerstmann | 29 July 2022, 22:05:35 UTC | [vulkan phase1] Add SPIR-V IR (#6882) * Import SPIRV-IR from personal branch * Refactor SPIR-V IR into separate header / source files. * Refactor SPIR-V factory methods. Fix SPIR-V interface library and header paths. Add SPIR-V internal test. * Hookup internal SPIRV IR test * Fixes and cleanups to address PR #6882 Refactor logic of SPIR-V dependency to make fetch dependency optional Change SPIR-V fetch dependency to avoid building and just populate contents Change SPIR-V internal test to always link against method ... only enabled if WITH_SPIRV is defined Add missing SPIRV target feature * Update src/CMakeLists.txt Co-authored-by: Alex Reinking <reinking@google.com> * Add missing iostream header when WITH_SPIRV is undefined * Fix declaration ordering for TARGET_SPIRV option so that dependencies get triggered * Turn on FETCH_SPIRV_HEADERS by default to get build to pass for now * Fix path finding logic for SPIR-V header path from populated fetch dependency * Revert back to Halide_SPIRV target name * Don't use imported interface for SPIR-V. Use Halide_SPIRV naming since target is defined before Halide itself. * Add local copy of SPIR-V header file, along with license and readme. Update CMake rules to use local include path by default. * Make SPIR-V include path a system path to avoid clang format/tidy processing * Remove SpirvIR.h header file from being included with Halide.h (since it's only used internally for CodeGen) * Add ./dependencies/spirv to clang format ignore file * Add comment about *not* including internally used headers like SpirvIR.h * Refactor is_defined() asserts into check_defined() for reuse * Add comment to SpirvIR.h header clarifying this file should not be exported. Fix formatting to avoid single line if statements. Use reserve for constructing vector components * Rename hash_* methods to make_*_key methods (since they construct a key and don't actually hash the value) Fix typo on components * Clang format/tidy pass * Fix formatting for more single-line if statements * Disable TARGET_SPIRV by default for now Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> Co-authored-by: Alex Reinking <reinking@google.com> | 29 July 2022, 22:05:35 UTC |
40f575c | Alexander Root | 29 July 2022, 04:29:46 UTC | fix x86 saturating_narrow pattern mistake | 29 July 2022, 04:29:46 UTC |
b3b3551 | Alexander Root | 28 July 2022, 17:52:12 UTC | fix case without WITH_X86 | 28 July 2022, 17:52:12 UTC |
0e5cfcf | Alexander Root | 28 July 2022, 17:39:17 UTC | temporary HVX/CSE fix | 28 July 2022, 17:39:17 UTC |
ec2cd4e | Alexander Root | 28 July 2022, 16:41:49 UTC | address nits | 28 July 2022, 16:41:49 UTC |
fa2d4e2 | Alexander Root | 28 July 2022, 05:09:31 UTC | remove 'implement VI visitor' error msg | 28 July 2022, 05:09:31 UTC |
6d2bfd1 | Alexander Root | 28 July 2022, 05:05:26 UTC | fix virtual func hidden error | 28 July 2022, 05:05:26 UTC |
e6502f8 | Alexander Root | 28 July 2022, 04:32:28 UTC | fix last remnants of vector intrinsic -> vector instruction renaming | 28 July 2022, 04:32:28 UTC |
c21bec5 | Alexander Root | 28 July 2022, 04:24:05 UTC | clang format | 28 July 2022, 04:24:05 UTC |
870af00 | Alexander Root | 28 July 2022, 04:22:13 UTC | fix merge conflict + implement psadbw | 28 July 2022, 04:22:13 UTC |
3648ca6 | Alexander Root | 28 July 2022, 04:06:39 UTC | implement a base class for instruction selection | 28 July 2022, 04:06:39 UTC |
11690d7 | Alexander Root | 28 July 2022, 03:42:12 UTC | disable UB for VectorInstruction node | 28 July 2022, 03:42:12 UTC |
b9a3356 | Steven Johnson | 27 July 2022, 23:32:14 UTC | Remove (most) of the env var usage from Adams2019 (#6861) * Move ASLog.cpp/.h to common/ * Add trivial Parsing utility & use it * Update ParamParser.h * fixes * wip * fixes * Fixes * clang-format * Update Makefile * Remove may_subtile * Update Cache.cpp * Update Cache.cpp * Update AutoSchedule.cpp * Update AutoSchedule.cpp | 27 July 2022, 23:32:14 UTC |
3859b36 | Andrew Adams | 27 July 2022, 22:12:36 UTC | Add support for generating x86 sum-of-absolute-difference reductions (#6872) | 27 July 2022, 22:12:36 UTC |
6c74a63 | Alexander Root | 27 July 2022, 21:36:57 UTC | clang format | 27 July 2022, 21:36:57 UTC |
17c9924 | Alexander Root | 27 July 2022, 21:36:14 UTC | fully remove saturating_pmulhrs | 27 July 2022, 21:36:14 UTC |
339b6b7 | Alexander Root | 27 July 2022, 21:34:02 UTC | undef -> poison | 27 July 2022, 21:34:02 UTC |
d660816 | Alexander Root | 27 July 2022, 21:31:57 UTC | Merge branch 'main' of github.com:halide/Halide into rootjalex/x86-optimize | 27 July 2022, 21:31:57 UTC |
f092606 | Alexander Root | 27 July 2022, 21:31:44 UTC | implement Andrew's requested changes | 27 July 2022, 21:31:44 UTC |
c8b811a | Steven Johnson | 27 July 2022, 20:20:16 UTC | Fixes to allow compiling with LLVM16 (#6889) | 27 July 2022, 20:20:16 UTC |
e3e169d | Steven Johnson | 27 July 2022, 16:21:44 UTC | Rewrite PythonExtensionGen to be C++ based (#6888) * Rewrite PythonExtensionGen to be C++ based This is intended as an alternative to #6885 -- this is even *more* gratuitous, but: - We have ~always compiled Python extensions using C++ anyway - This code is arguably terser, cleaner, and safer (the cleanups happen via dtors) - The code size difference is negligible (~300 bytes out of 160k for addconstant.cpython-39-darwin.so) * Update PythonExtensionGen.cpp | 27 July 2022, 16:21:44 UTC |
fb82166 | Alexander Root | 26 July 2022, 19:14:13 UTC | fix MSVC templating bug | 26 July 2022, 19:14:13 UTC |
6471226 | Alexander Root | 26 July 2022, 17:02:16 UTC | clang tidy | 26 July 2022, 17:02:16 UTC |
0675e86 | Alexander Root | 26 July 2022, 16:43:36 UTC | attempt to fix x86 vector-reduction splitting | 26 July 2022, 16:43:36 UTC |
2cfc0c1 | Alexander Root | 26 July 2022, 15:06:59 UTC | fix absd codegen bug | 26 July 2022, 15:06:59 UTC |
53c560b | Alexander Root | 26 July 2022, 14:57:05 UTC | fix virtual function hidden error | 26 July 2022, 14:57:05 UTC |
78edb81 | Alexander Root | 26 July 2022, 05:18:30 UTC | clang format | 26 July 2022, 05:18:30 UTC |
c2a6175 | Alexander Root | 26 July 2022, 05:16:01 UTC | fix instruction selection location | 26 July 2022, 05:16:01 UTC |