https://github.com/halide/Halide

Revision Author Date Message Commit Date
45b767e clang-format 06 September 2023, 21:57:29 UTC
bc29e4a Merge remote-tracking branch 'origin/main' into abadams/zen4 06 September 2023, 21:30:37 UTC
836879e Enable emission of float16/32 casts on x86 (#7837) * Enable emission of float16/32 casts on x86 Fixes #7836 Fixes #4166 * Fix comment * Don't catch bfloat casts * Fix missing word in comment 06 September 2023, 21:29:27 UTC
9d94842 Fix constant in comment 06 September 2023, 00:13:19 UTC
d058212 Add missing enum 05 September 2023, 23:57:08 UTC
2eaa568 Don't use llvm's bfloat type at all 05 September 2023, 23:15:50 UTC
58488e3 Give up on native bfloat16 conversion for now 05 September 2023, 23:13:39 UTC
8122dc1 Merge branch 'abadams/enable_f16c' into abadams/zen4 05 September 2023, 22:49:51 UTC
2d48d37 Use llvm BFloat type for bfloat intrinsics 05 September 2023, 22:31:53 UTC
efb69cd Fix Zen4 model number 05 September 2023, 22:19:36 UTC
0b12c24 Don't catch bfloat casts 05 September 2023, 22:07:46 UTC
264e062 Fix comment 05 September 2023, 21:31:04 UTC
5143840 Fix runtime detection, sapphire rapids CPUID bits 05 September 2023, 21:17:48 UTC
02865e2 Add a check that PredicateLoads must be used in the outermost split of a dimension (#7788) * add a check that PredicateLoads must be used in the outermost split of a dimension * newline * use the repro example * fix * avoid check for every other tail strategy * update error message to point out what's not allowed --------- Co-authored-by: Steven Johnson <srj@google.com> 05 September 2023, 20:28:11 UTC
edffe44 Add avx512_Zen4 target flag It's a superset of cannon lake, and a subset of sapphire rapids 05 September 2023, 18:57:23 UTC
8b85321 Add support for zen4 05 September 2023, 17:55:29 UTC
00d29bd Enable emission of float16/32 casts on x86 Fixes #7836 Fixes #4166 05 September 2023, 17:13:39 UTC
8188b42 Avoid generating name collisions in CSE (#7821) * Avoid generating name collisions in CSE Alternative to #7801 (See the discussion there) Fixes #4124 * Add missing test * Minor cleanup * clang-format 01 September 2023, 17:38:19 UTC
ddfb1dc Don't return an undefined Stmt() from IfThenElse visitor (#7816) Fixes #7815 01 September 2023, 17:37:50 UTC
24d846c Remove dead `auto-schedule` label in CMake (#7818) These were replaced by more granular labels. Also, drive-by fix to comment that needed plurals. 30 August 2023, 23:54:48 UTC
afc61b2 Update 'Check CMake file lists' action (#7809) * Update 'Check CMake file lists' action Several subcategories were missing -- let's add them and see if they should be there or not * bogus change * Add missing comments * Revert "bogus change" This reverts commit 80454b1313e1c06b5432d15287fa1f51185f70b6. 30 August 2023, 23:54:09 UTC
3a1dffe Move clang-tidy checks back to Linux (#7817) * Move clang-tidy checks back to Linux Recent changes in the GHA runners for macOS don't play well with clang-tidy; rather than sink any more time into debugging it, I'm going to revert the relevant parts of #7746 so that it runs on the less-finicky Linux runners instead. * bogus * Update Generator.cpp * Update Generator.cpp 29 August 2023, 16:23:44 UTC
fa136cb Ensure that multitarget AOT builds have consistent random sequence (#7717) * Fix CMake test for generator_aot_multitarget * Ensure that multitarget AOT builds have consistent random numbers If a Generator uses random_float() (or the int or uint versions), and is used in a multitarget build, we weren't resetting the counters for random generation between each subtarget... meaning that each subtarget would get a different random sequence, leading to some very hard-to-debug test failures when running on different hardware variants. This PR ensures that the relevant counters are all reset before each subtarget is generated, so that each should see the same sequence of random number generation. * Update CMakeLists.txt * Update multitarget_aottest.cpp * Combine float/uint counters 29 August 2023, 16:21:59 UTC
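The counter reset described in fa136cb can be illustrated with a minimal, self-contained sketch. It does not use Halide: `RandomCounter`, `generate_subtarget`, and `build_all` are hypothetical names invented for this illustration, and the "random sequence" is just the stream of counter values each subtarget would consume.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the counter behind random_float() and friends.
struct RandomCounter {
    uint32_t next = 0;
};

// Pretend to generate one subtarget: every random call consumes one counter value.
std::vector<uint32_t> generate_subtarget(RandomCounter &counter, int num_random_calls) {
    std::vector<uint32_t> tags;
    for (int i = 0; i < num_random_calls; i++) {
        tags.push_back(counter.next++);
    }
    return tags;
}

// Generate every subtarget, optionally resetting the counter before each one.
std::vector<std::vector<uint32_t>> build_all(int num_subtargets, int num_random_calls,
                                             bool reset_between_subtargets) {
    RandomCounter counter;
    std::vector<std::vector<uint32_t>> result;
    for (int t = 0; t < num_subtargets; t++) {
        if (reset_between_subtargets) {
            counter = RandomCounter{};  // every subtarget starts from the same state
        }
        result.push_back(generate_subtarget(counter, num_random_calls));
    }
    return result;
}
```

With `reset_between_subtargets` set to `true`, every subtarget sees the same counter values. With `false`, the second subtarget continues where the first stopped, which is the kind of divergence the commit message describes.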
fe9f0b7 [serialization] Add serialization support to generator interface (#7792) * Add serialization support to Generator interface * Clang format pass * Make target required when emitting a serialized pipeline (since schedule may be target dependent). Apply auto-scheduler before serialization so that schedules can be serialized. * Fix enum ordering for hlpipe. Fix hlpipe comments. Add missing hlpipe enum to pyenums. * Remove unused Serialization build_mode * Fix formatting * Remove unused serializable flag. Remove redundant cpp_stub check. Fix comments. * Safeguard emit_hlpipe calls with #ifdef WITH_SERIALIZATION --------- Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> Co-authored-by: Steven Johnson <srj@google.com> 28 August 2023, 18:13:53 UTC
79d2be3 Update clang-tidy action to stop breaking (#7808) * Switch clang-tidy action from macos-13 to macos-latest `macos-latest` is actually macos-12 (macos13 is considered "beta" on the GHA runners). Hopefully this will fix the recent install snafus that are breaking clang-tidy. * Bogus change to trigger check * Update presubmit.yml * Update presubmit.yml * Update presubmit.yml * Revert "Bogus change to trigger check" This reverts commit a70f9ed8e6032d4b7799ff0cf6c009a7d2f92b3a. * Update presubmit.yml 28 August 2023, 17:21:38 UTC
8ac1e1c Add jump-buttons to get from Stmt directly to Assembly (#7793) Co-authored-by: Steven Johnson <srj@google.com> 28 August 2023, 16:46:45 UTC
69c75b3 Update WebGPU to latest Emscripten/Dawn API (#7804) * Update WebGPU to latest Emscripten/Dawn API - Updated mini_webgpu.h to be in sync with Dawn as of commit ded6610f45a8826db37b52d73121a66b74d8aa61 - Updated the use of SetDeviceLost callbacks to be in the DeviceDescriptor instead of a separate call - Updated a couple of fields that got renamed - Update webgpu.cpp and gpu_context.h to always use wgpuCreateInstance() and wgpuInstanceRelease(), since the Dawn node bindings now support & require them * clang-tidy 24 August 2023, 23:12:19 UTC
84faa68 [wasm] Enable PIC for WebAssembly on LLVM v18.x (#7803) * Enable PIC code generation for WebAssembly for LLVM >18. Enable +mutable-globals to support dynamic linking * Fix LLVM v18 interface changes for writeArchive() Add RelLookupTableConverterPass for PIC (in LLVM v18) * Resolve conflict for writeArchive interface changes. * Clang format pass --------- Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> 24 August 2023, 22:22:22 UTC
84af2cd Add support to the makefile for serialization (#7762) * Add support to the makefile for serialization * Fix deps * Fix for no flatc, and for homebrew --------- Co-authored-by: Steven Johnson <srj@google.com> 24 August 2023, 22:18:09 UTC
f56b9ad Remove some unused includes (#7799) 24 August 2023, 21:48:26 UTC
678ea32 [ARM] support new udot/sdot patterns (#7800) 24 August 2023, 19:49:57 UTC
88c75ec [ARM] Distribute shifts as muls (#7790) * [ARM] distribute shifts as muls This reverts commit eba8f325edfaaa7b11c52a19435200f6b28e539a. --------- Co-authored-by: Steven Johnson <srj@google.com> 24 August 2023, 17:31:26 UTC
e8df5cf Fix for top-of-tree LLVM (#7798) 23 August 2023, 18:05:37 UTC
acc9413 Don't inject undef() in the simplifier (#7791) We shouldn't be using undef() in the simplifier. This replaces a load with a constant false predicate with a zero instead. I also added a guard around some dubious logic about out of bounds loads. out of bounds loads may be reachable if they have a false predicate, so I changed this simplification to only trigger if the load is unpredicated. 22 August 2023, 15:49:44 UTC
6efecbe slice IRMatcher should only match on slices (#7772) * slice IRMatcher should only match on slices Fixes #7768 * Add test 22 August 2023, 15:49:29 UTC
fcc1c3b [Hexagon] -Build Hexagon runtime components using the Hexagon SDK (Clone of #7671) (#7741) * Add CMakeLists.txt to build the hexagon_remote runtime. * Print an error message if libhalide_hexagon_host.so is not found. * Fix case mismatch in hexagon_remote/CMakeLists.txt * Remove some code that had been commented out in hexagon_remote/CMakeLists.txt * Remove unused argument in macro in hexagon_remote/CMakeLists.txt * add find module for Hexagon * move more variables to find module * Build binary modules with ExternalProject * group platform-specific sources into subdirectories * Pass HEXAGON_TOOLS_ROOT, too * Use the desired layout for the build-tree artifacts * Use SYSTEM for Hexagon SDK include dirs * trigger buildbots * Ignore code in src/runtime/hexagon_remote/bin/src for clang-tidy * Just skip hexagon_remote entirely for Halide_CLANG_TIDY_BUILD * Add an option to enable the building of the hexagon remote runtime --------- Co-authored-by: Alex Reinking <quic_areinkin@quicinc.com> Co-authored-by: Steven Johnson <srj@google.com> 21 August 2023, 21:16:51 UTC
708d41b Don't introduce reinterprets in find/lower intrinsics (#7776) 21 August 2023, 18:45:05 UTC
f11e80d Fix out of bounds access in anderson2021_test_apps_autoscheduler (#7771) * Fix out of bounds access in anderson2021_test_apps_autoscheduler * clang-format 21 August 2023, 17:06:07 UTC
36eb0b2 Try to fix remaining ASAN-reported leaks (#7767) This fixes all but one of the known remaining ASAN-related leaks; the remaining is in `tutorial_lesson_19_wrapper_funcs` I can't debug that one locally because the leaks are in OpenCL and I am temporarily relegated to using a 'cloud' machine with no real GPU for linux-x64 -- if someone with access to such a machine could take a look, I'd appreciate it (examples of leakage at https://buildbot.halide-lang.org/master/#/builders/154/builds/79/steps/12/logs/tutorial_lesson_19_wrapper_funcs) 21 August 2023, 17:05:25 UTC
c50d11a Speedup page loading of VizStmt. (#7755) * Speedup page loading of VizStmt. Disabled line numbers in the syntax highlighting of the assembly. Made syntax highlighting on-demand with a button. * Fix computedStyleMap() not available in Firefox. * Re-enable assembly highlighting by default. 21 August 2023, 17:00:17 UTC
840ed4d Remove fragile simd_op_check test for mlal/mlsl on ARM (#7775) 18 August 2023, 22:31:32 UTC
4e6fe00 Fix vector reduce HTML (#7773) VectorReduce: Div cannot be in Span 17 August 2023, 18:07:11 UTC
f2f2af2 Define `cast<i32>(u32)` overflow behavior (#7769) uint32 -> int32 casting should not produce SIO 17 August 2023, 16:11:21 UTC
f75f68d Experimental serializer (#7594) * init * sync * single func pipeline round-trip test * roundtrip test framework completed, single output function tested, no Dag yet * serialize Stmt, partially done (cuz no support of Expr yet), not fully tested * deviceAPI MemoryType ForType * Expr, with a grain of salt * fix exprs in stmts * format everything * Range * fix undefined exprs and stmts * address some review comments: - proper using - proper includes - rename Serdes -> Serialize * address more review comments - rename .hlb/.hlr to .hlpipe - reserve vectors - proper memory management * deserialize_expr_vector * support bound, storageDim, loopLevel and funcSchedule * Specialization, Definition * sync commit * temporarily comment out func mapping stuff to remove blockers * helper funcs * call_type and reduction_domain * ModulusRemainder and VectorReduceOp, some minor refactoring * prefetch directive * name mangling and closing on function's odds and ends * split * dim * stage schedule * tidy * parameter * more parameter * check nullptr and some minor fix * fix crashing * func index replacing func ptr during serialization * extern func arg, some minor cleanup * replace cerr with halide assert * buffer?? 
* remove printer * fix * wrappers in func_schedule * clear func mapping to use serializer for more than 1 pipelines, use unordered_map also * attempt to move serialization into core, get cmake working for now * fix * we maybe don't need submodule * fix cmake * make headers work again, with some hacks ofc * serialization now lives in libHalide * testing 101 * don't include flatbuffers header in Halide.h * fix * namespace adjust * user_assert * fix a missing field * fix missing type info in some exprs * fix bug in function mapping * fix function DAG broken issue * format * rm cout in cpp files and change test group name * fix the case func ptr is not defined * add a missing call type deserialization * serialize unique parameters * serialize unique buffers * fix missing type in parameter * fix a missing tail stra * change find_transive_call to build_enviroment to include wrappers in the DAG * upstream current test strategy, intercept JIT compilation for each pipeline, serdes ronudtrip and back * make sure buffer memory layout are the same * don't use ir comparator to compare pipelines, we will use jit tests from now * don't serialize Parameter's buffer, compute external buffers from Call, Variable and ExternFuncArgument and don't serialize them as well * fix, 35 tests remaining * fix output function orders * reuse jit_externs since we cannot really serialize it, 29 tests to go * fix that buffer_constraints, host_alignment and memory_type are incorrectly removed, also add missing exact in split * only use outputs and requirements from deserialized pipeline during testing * nits * add missing requirements during deserialization * restore original pipeline's contents after lowering * address some review comments * Install flatbuffers for clang-tidy * use std::map to make results the same on different compiler * proper way to handle cropped buffers * fix cmake build using alex's branch * try set flatbuffers_DIR explicitly * case sensitive? 
* rename serialization test env var * cleanup Serialization.cpp * format * have halide version embedded in the file identifier * nits and comments * format * try make clang-tidy happy and const a lot of things * const more things * support istream input * nit * add template function deserialize_vector * nit * attempt to integrate serialization test * line breaks * remove hack in compile_jit, at least for now * fix * add #ifdef guards * format * try nolint * special case two files so clang-tidy will be happy * Make Flatbuffers-missing error more useful * Make a few final changes - change BUILD_SERIALIZATION -> WITH_SERIALIZATION to match other flags better - fix capitalization of the CMake package (must be `FlatBuffers` for some Linux usage) - add stub calls to the de/serialization calls when building without Flatbuffers * Oops addition * clang-format * Add temporary debug hackery * more hackery * grr * sdfsdf * sigh, capitalization * One more try * Update presubmit.yml * No more mr nice guy * Update CMakeLists.txt * Revise build rules & script to allow clang-tidy for the new files * Update CMakeLists.txt * Apply clang-tidy fixes * Fix target for generated header * Prefer to use FetchContent for flatbuffers * Fixes * set PIC on * more pic * fix attempt * fix attempt * try macos * coreutils * Update run-clang-tidy.sh * noquiet * final again? --------- Co-authored-by: Steven Johnson <srj@google.com> 11 August 2023, 21:45:38 UTC
93514c3 StmtViz: Search for tooltip only in the child node (#7754) Search for tooltip only in the child node Further cut ~5 second of StmtVisualizer rendering by searching for the tooltip text-box in the child node of the current button. Previously, the script compose the global ID with regular expression, and then search the entire DOM causing delays. 10 August 2023, 20:42:49 UTC
7054828 Improve error-handling in Anderson2021, and ensure build deps are cor… (#7748) * Improve error-handling in Anderson2021, and ensure build deps are correct * clang-format 10 August 2023, 17:01:12 UTC
150a930 [vulkan] Fix SPIR-V IR references causing leaks (#7739) * Remove unnecessary parent refs and owning function/block refs. Add explicit clear methods for contents structs and destructors. * Move objects when changing ownership --------- Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> 07 August 2023, 19:30:41 UTC
7b45542 [vulkan] Fix heap buffer overflow in Vulkan extension handling discovered by ASAN (#7740) Fix heap buffer overflow in Vulkan extension handling discovered by ASAN Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> 07 August 2023, 19:29:25 UTC
2f5c4d2 Revert accidental typo change in #7746 (#7747) 07 August 2023, 18:14:59 UTC
af56605 Permit llvm 15 on windows (#7744) Our build instructions for windows are currently broken, because vcpkg is still on llvm 15. This PR unbreaks them. Re-enabling any testing of llvm 15 to be discussed. 07 August 2023, 16:24:03 UTC
25028cd Allow optional sorting of profiler output via HL_PROFILER_SORT env var (Fixes #7638) (#7639) * Allow optional sorting of profiler output via HL_PROFILER_SORT env var (Fixes #7638) * trigger buildbots * Update profiler_common.cpp * Update float16_t.cpp * Update float16_t.cpp * Update float16_t.cpp * Update float16_t.cpp 07 August 2023, 16:13:41 UTC
c254043 Fix leaks in test/correctness/memoize.cpp (#7705) * Fix leaks caused by self-referential parameter constraints * Add comment * Add missing overrides * Fix reported leaks in memoize test by explicitly releasing the shared runtime at the end of the test * Use const refs for non-mutated args * Hopefully fix for windows * Fix for 32-bit pointers * Don't use _aligned_malloc It requires _aligned_free, which the runtime aint gonna do * Fix other memoize test * Use runtime built-in malloc/free On windows mixing and matching mallocs and frees doesn't work well. * Fix comment --------- Co-authored-by: Steven Johnson <srj@google.com> 05 August 2023, 18:52:46 UTC
f39576f Fix infinite recursion in loop partitioning (#7743) * Fix infinite recursion in partition loops We weren't stripping the likely tags off the unlikely case on a store/load predicate, resulting in infinite recursion. * Add test * Remove accidental return 05 August 2023, 18:52:23 UTC
87087f1 Run clang-tidy on macOS runners instead of Linux (#7746) * Run clang-tidy on macOS runners instead of Linux The current macOS runners have twice the RAM and more CPU power. Also, drive-by change to allow specifying the parallelism that the run-clang-tidy script should use (defaults to nproc) * Update Generator.cpp * Update run-clang-tidy.sh * Update run-clang-tidy.sh 04 August 2023, 23:36:26 UTC
48b3df6 Speedup the VizIR HTML. (#7713) * From 12s to 2s, by eliminating the bulk of the $() calls. * Speed up recursive depth function by not using jQuery. * Changed out CodeMirror for Speed-Highlight. Additionally several fixes regarding the StmtViz. --------- Co-authored-by: Steven Johnson <srj@google.com> 04 August 2023, 17:00:41 UTC
bc30d6f Revise labels on autoscheduler tests (#7732) * Revise labels on autoscheduler tests This is step 1 in fixing https://github.com/halide/Halide/issues/7731: it replaces the `autoschedulers` tag with more granular ones, so that we can modify the build script to test the right autoscheduler(s) for a given backend. (Note that the `autoschedulers` tag was unused by the buildbots, which only used the generic `auto_schedule` tag.) Step 2 will be to modify the buildbot script after this lands to use the new tags above. Step 3 will be to remove the `auto_schedule` tag. * Fix anderson2021 labels 03 August 2023, 00:18:33 UTC
734df3f Clean up really long line lengths in Anderson2021 (#7728) * Clean up really long line lengths in Anderson2021 We don't have an explicit line length limit in Halide, but generally consider 120 to be a reasonable extent; a lot of code in Anderson2021 went waaaay over this limit, especially function/method calls. I did a semi-manual cleanup to try to clean up the worst offenders. Should be 100% cosmetic. * Add LoopNestMap * Fixes 02 August 2023, 18:12:25 UTC
ef24391 Ignore code in src/runtime/hexagon_remote/bin/src for clang-format (#7736) 02 August 2023, 17:21:17 UTC
8fe4f99 Fix leak on cloning functions with update defs (#7735) * Fix leak on cloning functions with update defs When cloning a Func with an update def, the remapping map resulting from the deep copy may already contain a key for the wrapped function pointing to a strong reference to itself. The reasons are unclear to me, but it means that emplace silently does nothing and we get a memory leak because the cloned Func's update definition has a strong self-reference after the remapping is applied. We want to replace it with a weak reference, so this PR changes things to use operator[] instead of emplace. * Add comment 02 August 2023, 16:39:44 UTC
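The `emplace` versus `operator[]` distinction that 8fe4f99 relies on is standard-library behavior and can be shown without Halide. The sketch below uses a `std::map` of strings as a hypothetical stand-in for the remapping table (the real table maps function references, not strings); `Remap`, `remap_with_emplace`, and `remap_with_subscript` are names invented for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the remapping table built during the deep copy.
using Remap = std::map<std::string, std::string>;

// emplace() leaves an existing entry untouched; .second reports whether it inserted.
bool remap_with_emplace(Remap &m, const std::string &key, const std::string &value) {
    return m.emplace(key, value).second;
}

// operator[] followed by assignment always leaves the key mapped to the new value.
void remap_with_subscript(Remap &m, const std::string &key, const std::string &value) {
    m[key] = value;
}
```

If the map already holds `{"f", "strong"}`, then `remap_with_emplace(m, "f", "weak")` returns `false` and the entry stays `"strong"`, while `remap_with_subscript(m, "f", "weak")` replaces it. That silent no-op is why the commit switches to `operator[]`.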
0839270 Attempt to fix #7703 (#7706) * Attempt to fix #7703 * fixes * Update LoopNest.cpp * Update GPULoopInfo.h * Fixes. * clang-tidy 01 August 2023, 20:55:28 UTC
831fd1a Fix RDom usage in anderson2021_test_apps_autoscheduler (Fixes #7729) (#7734) 01 August 2023, 16:36:15 UTC
3ced617 [Hexagon] - Fix problems in sim_host.cpp (#7725) * Fix problems in src/runtime/hexagon_remote/sim_host.cpp reported by clang-tidy and clang-format 01 August 2023, 14:45:00 UTC
ef51a23 Remove unused using decl (#7730) Also convert a std::vector to a vector in a file that has using std::vector 01 August 2023, 00:07:29 UTC
9f43580 Change default generator timeout to infinite (#7718) 31 July 2023, 21:51:21 UTC
f54bc08 Fix handling of thread features for scalars in Anderson2021 (#7726) * Fix handling of thread features for scalars * Remove unneeded change 31 July 2023, 21:27:01 UTC
fca8d96 Making Metal code-gen a bit faster (#7720) removing redundant print_expr() call 28 July 2023, 16:55:19 UTC
89ffae2 Making HLSL code-gen a couple orders of magnitude faster... (#7719) Removing redundant print_expr() 28 July 2023, 16:22:57 UTC
649a224 Fix CMake test for generator_aot_multitarget (#7716) * Fix CMake test for generator_aot_multitarget * Update CMakeLists.txt 27 July 2023, 22:56:50 UTC
df4c981 Throw an error if split is called with the same older and inner var name (#7715) * throw an error if split is called with the same older and inner name * update * fix naming * rewording * add test --------- Co-authored-by: Steven Johnson <srj@google.com> 27 July 2023, 15:13:02 UTC
09c5d1d Default WITH_TEST_FUZZ to OFF (#7695) * Fix for top-of-tree LLVM * Default WITH_TEST_FUZZ to OFF Just because our compiler supports fuzzing doesn't mean we want to build the fuzz tests, because they won't really build properly without the right preset specified. (This will be followed up with a change to the buildbot to set WITH_TEST_FUZZ to ON for fuzz tests) 26 July 2023, 22:25:43 UTC
bfc26cc Improved profiler result printing. (#7709) * Fixed the regularization for BGU. * Improved profiler result printing. * Clang-format ain't liking pretty code. * Clang-tidy ain't liking pretty code. --------- Co-authored-by: Steven Johnson <srj@google.com> 26 July 2023, 22:03:52 UTC
5749d8c Upgrade Halide main branch for LLVM18 (#7710) LLVM just added `release/17.x` branch and now trunk is 18 -- update our build files and docs accordingly (see also https://github.com/halide/build_bot/pull/248, which needs to land first) 26 July 2023, 20:51:48 UTC
c9bf3b1 Fix float16 warning for older clangs (#7701) 25 July 2023, 20:29:57 UTC
f41c392 Fix leaks caused by self-referential parameter constraints (#7700) * Fix leaks caused by self-referential parameter constraints * Add comment * Add missing overrides * Use const refs for non-mutated args 25 July 2023, 20:25:15 UTC
ab3ff3a Mark all single-arg ctors in src/runtime as explicit (#7707) Minor code hygiene fix, done as byproduct of #7704 25 July 2023, 20:24:20 UTC
df902e7 Mark all single-arg ctors in autoscheduler code as `explicit` (#7704) explicit ctors 25 July 2023, 19:14:12 UTC
fd9bfc8 Fix clang and llvm versions in scripts (#7702) * fix clang+llvm versions in files * more fixes 24 July 2023, 21:45:47 UTC
ce16f91 Fixed the regularization for BGU. (#7684) Co-authored-by: Steven Johnson <srj@google.com> 24 July 2023, 18:22:51 UTC
943bc5f Convert error to warning (#7698) Accidentally checked in #7697 with the failure mode as error, not warning 24 July 2023, 18:19:50 UTC
128bcdf Add a warning if a Generator declares any Outputs before the final Input (Fixes #7669) (#7697) * Add a warning if a Generator declares any Outputs before the final Input (Fixes #7669) See https://github.com/halide/Halide/issues/7669 for details * Update abstractgeneratortest_generator.cpp * Add note about allow_out_of_order_inputs_and_outputs() to warning 24 July 2023, 17:44:03 UTC
71eb4ee Fix for top-of-tree LLVM (#7694) 21 July 2023, 00:56:11 UTC
475b774 Fix float16 under asan, attempt #2 (#7691) * Fix float16 under asan, attempt #2 Some sneakiness going on. * Update float16_t.cpp 19 July 2023, 19:13:02 UTC
0112da4 Fix quadratic algorithm in simplify_correlated_differences (#7686) This pass called expr_uses_var in a loop while building up a potentially long let chain. This does a quadratic amount of work in the size of the let chain, which stalled compilation for a particular pathological pipeline I encountered. This changes it to an eager algorithm that tracks the set of free variables and incrementally grows it instead of revisiting the entire expr for each new let added. It is n log(n) in the number of lets instead of n^2 Co-authored-by: Steven Johnson <srj@google.com> 19 July 2023, 18:27:35 UTC
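The complexity change described in 0112da4 can be sketched on a toy model. This is not Halide's `simplify_correlated_differences`; it is an illustration that assumes a let chain listed innermost first, unique let names, and values represented only by the variable names they mention. `Let`, `used_naive`, and `used_incremental` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy let binding: only the variables mentioned by the bound value matter here.
struct Let {
    std::string name;
    std::vector<std::string> value_vars;
};

// Naive version: for each let, rescan the body and every inner let's value.
// This is quadratic in the length of the chain.
std::vector<bool> used_naive(const std::vector<Let> &lets,
                             const std::vector<std::string> &body_vars) {
    std::vector<bool> used;
    for (std::size_t i = 0; i < lets.size(); i++) {
        bool found = std::find(body_vars.begin(), body_vars.end(), lets[i].name) != body_vars.end();
        for (std::size_t j = 0; j < i && !found; j++) {
            const std::vector<std::string> &v = lets[j].value_vars;
            found = std::find(v.begin(), v.end(), lets[i].name) != v.end();
        }
        used.push_back(found);
    }
    return used;
}

// Incremental version: keep an ordered set of every variable mentioned so far
// and grow it as each let is added, so each query is one O(log n) lookup.
std::vector<bool> used_incremental(const std::vector<Let> &lets,
                                   const std::vector<std::string> &body_vars) {
    std::set<std::string> seen(body_vars.begin(), body_vars.end());
    std::vector<bool> used;
    for (const Let &l : lets) {
        used.push_back(seen.count(l.name) > 0);
        seen.insert(l.value_vars.begin(), l.value_vars.end());
    }
    return used;
}
```

Both functions return the same answers under the stated assumptions. When each value mentions a bounded number of variables, the incremental version does O(n log n) total work, matching the bound quoted in the commit message.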
18fbc15 Add Sanitizer details to README_cmake.md (#7688) 18 July 2023, 18:17:27 UTC
5f56e64 Add a select overload for tuples (#7672) * Add a select overload for tuples * Add missing overload * deprecate tuple_select * Fix Python bindings for deprecation of tuple_select() * Update PyIROperator.cpp --------- Co-authored-by: Steven Johnson <srj@google.com> 18 July 2023, 16:05:59 UTC
4ba0d8b Fix correctness_float16_t for ASAN builds (#7687) This appears to be a glitch that has to do with changing ABI for float16 across versions of GCC; we build LLVM with gcc-9 on Linux, but the float16 ABI got changed (and unified in gcc12); since ASAN builds use Clang even on linux, there is a hiccup here. This is an ugly monkey-patch to work around this issue. 18 July 2023, 16:03:37 UTC
601b5c5 Remove ParamMap (#7675) ParamMap was deprecated in Halide 16; per https://github.com/halide/Halide/pull/7357, we should go ahead and remove it for Halide 17, in favor of `compile_to_callable()`. 11 July 2023, 19:37:55 UTC
41d6d94 Update onnx app to Adams2019 autoscheduler and new autoscheduler API (#7673) * Update onnx app to Adams2019 autoscheduler and new autoscheduler API Fixes #7670 * Add model test too * Remove use of tmpnam * Don't test onnx app in a 32-bit build 11 July 2023, 16:52:18 UTC
9755e3d Attempt to fix intermittent PCH "modified" errors (#7666) * Attempt to fix intermittent PCH "modified" errors * Update CMakeLists.txt * Update CMakeLists.txt Co-authored-by: Alex Reinking <alex.reinking@gmail.com> --------- Co-authored-by: Alex Reinking <alex.reinking@gmail.com> 29 June 2023, 17:15:03 UTC
6f2cae6 Dependency wrangling part 0/N: standard CMake modules (#7658) * Hoist Threads::Threads to the top level * Remove global OpenGL dependency This is added by the helpers as-needed. Removing it here lets one build just libHalide without searching for OpenGL. * Narrow scope of OpenMP to tutorial Only the tutorial targets actually use OpenMP. Don't search for OpenMP if WITH_TUTORIALS is off. * Move JPEG and PNG deps to tools Only the Halide::ImageIO library uses these directly, so limiting the scope protects against unintended use. * Work around CMake bug The CMake $<TARGET_NAME_IF_EXISTS:...> genex uses dynamic scoping w.r.t. the target environment, rather than the usual static scoping. This means we need to move the PNG and JPEG dependencies higher up. * Add link to CMake issue in comments. 28 June 2023, 16:38:47 UTC
470f43c Bump Halide version to 17.0.0 in main (#7636) * Bump Halide version to 17.0.0 in main * Bump compatible LLVM version requirements to 17, 16, 15. Update build instructions to use newer LLVM version. * Bump clang-format/tidy LLVM version to 15 (minimum required to build Halide) * trigger buildbots * Revert LLVM requirements for run_clang_format/tidy. Do this in a separate PR. --------- Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> Co-authored-by: Steven Johnson <srj@google.com> 27 June 2023, 17:34:55 UTC
c7ca15f Enable clang-tidy's modernize-use-default-member-init check (#7662) * Upgrade clang-format and clang-tidy to use v16 (Skipping over 15 entirely in favor of the newest stable version) * Update presubmit.yml * Update .clang-tidy * Update .clang-tidy * fixes * Update run-clang-tidy.sh * Update .clang-tidy * Update .clang-tidy * fixes * Update .clang-tidy * Update PyHalide.cpp * Update run-clang-tidy.sh * Update CodeGen_Vulkan_Dev.cpp * Update .clang-tidy * fix * format 26 June 2023, 22:22:27 UTC
c28a00f Update for top-of-tree LLVM changes (#7663) 26 June 2023, 20:10:27 UTC
1e3431c Enable the misc-use-anonymous-namespace clang-tidy check (#7661) * Upgrade clang-format and clang-tidy to use v16 (Skipping over 15 entirely in favor of the newest stable version) * Update presubmit.yml * Update .clang-tidy * Update .clang-tidy * fixes * Update run-clang-tidy.sh * Update .clang-tidy * Update .clang-tidy * fixes * Update .clang-tidy * Update PyHalide.cpp * Update run-clang-tidy.sh * Update CodeGen_Vulkan_Dev.cpp * Enable the misc-use-anonymous-namespace clang-tidy check Basically just says "don't use static" * Update Generator.h * Update Util.cpp * Update JITModule.cpp 24 June 2023, 01:35:17 UTC
c2e4f6d Upgrade clang-format and clang-tidy to use v16 (#7660) * Upgrade clang-format and clang-tidy to use v16 (Skipping over 15 entirely in favor of the newest stable version) * Update presubmit.yml * Update .clang-tidy * Update .clang-tidy * fixes * Update run-clang-tidy.sh * Update .clang-tidy * Update .clang-tidy * fixes * Update .clang-tidy * Update PyHalide.cpp * Update run-clang-tidy.sh * Update CodeGen_Vulkan_Dev.cpp 24 June 2023, 01:33:46 UTC
2a93cb0 Get the ASAN toolchain working again (#7604) * Get the ASAN toolchain working again Various fixes to enable ASAN to finally work (linux x64 only). Note that this found several ASAN failures in the Anderson2021 autoscheduler tests, which are *not* fixed yet; I'll fix those in a subsequent PR. * Remove stuff that I didn't mean to check in * Configure cuda-specific tests properly too * trigger buildbots * Update CodeGen_LLVM.cpp * Update CodeGen_LLVM.cpp * Fix sloppiness? * Update CMakeLists.txt * trigger buildbots * Use Halide_PYTHON_LAUNCHER to implement ASAN toolchain fixes (#7657) * Use new Halide_PYTHON_LAUNCHER to set env vars * Update CMake docs for Halide_SANITIZER_ENV_VARS --------- Co-authored-by: Alex Reinking <areinkin@qti.qualcomm.com> --------- Co-authored-by: Alex Reinking <alex.reinking@gmail.com> Co-authored-by: Alex Reinking <areinkin@qti.qualcomm.com> 23 June 2023, 20:53:14 UTC
0de9eb2 Fix incorrect name-mangling for llvm.experimental.vp.strided.load (#7654) These ops are only used for RISCV codegen at present, and this one tended to only happen for complex patterns that we don't test in our very limited crosscompilation tests. 23 June 2023, 17:47:03 UTC
0218c9e Add a compositing example app (#7646) * Initial version of a compositing demo app * Improve schedule; add GPU version * Better mux codegen * Consider all definition exprs in mullapudi autoscheduler * Add Tuple mux to IROperator * clang-format, better comments * Remove pointless blank line * Add some fixed-point intrinsics to RegionCosts.cpp to suppress warnings * Add perf numbers * Hopefully fix cmake build * clang-format * clang-format * Fix muxing FuncRefs * More comments * Update process.cpp * Include cmath to hopefully get M_PI * Revert inclusion of cmath --------- Co-authored-by: Steven Johnson <srj@google.com> 23 June 2023, 15:21:38 UTC
1e963ff Default RISCV backend to OFF for LLVM < 17 (#7650) LLVM17 is doing a lot of work on the RISCV backend, and the amount of testing done on Halide's LLVM16-based RISCV codegen is very light. It's been suggested that we should default to not enabling the RISCV backend for LLVM16 and earlier because of this (so that people attempting to use Halide for RISCV won't encounter a possible footgun). This PR just adds the relevant mechanism; whether or not this is the correct decision is not clear. Discussion welcome. 22 June 2023, 21:45:22 UTC
9232218 Fix RISCV codegen for top-of-tree LLVM (#7648) * Fix RISCV codegen for top-of-tree LLVM Also add a warning if you try to codegen with older versions of LLVM: many intrinsics have changed in ways that are hard to deal with both ways, and trying to support both would be painful and of dubious value. * Make LLVM16 work too * Update CodeGen_RISCV.cpp 22 June 2023, 18:20:20 UTC