https://github.com/halide/Halide

Revision Author Date Message Commit Date
3e80492 replacing ostringstream sentinel by nil ostream 02 August 2023, 17:14:49 UTC
d964c98 fixing signatures and template instantiations 01 August 2023, 18:51:31 UTC
73947ea adding halide_stream to replace ostringstream 01 August 2023, 17:07:35 UTC
017e703 Tracy instrumentation and stringstream hooks 28 July 2023, 18:55:48 UTC
c9bf3b1 Fix float16 warning for older clangs (#7701) 25 July 2023, 20:29:57 UTC
f41c392 Fix leaks caused by self-referential parameter constraints (#7700) * Fix leaks caused by self-referential parameter constraints * Add comment * Add missing overrides * Use const refs for non-mutated args 25 July 2023, 20:25:15 UTC
ab3ff3a Mark all single-arg ctors in src/runtime as explicit (#7707) Minor code hygiene fix, done as byproduct of #7704 25 July 2023, 20:24:20 UTC
df902e7 Mark all single-arg ctors in autoscheduler code as `explicit` (#7704) explicit ctors 25 July 2023, 19:14:12 UTC
fd9bfc8 Fix clang and llvm versions in scripts (#7702) * fix clang+llvm versions in files * more fixes 24 July 2023, 21:45:47 UTC
ce16f91 Fixed the regularization for BGU. (#7684) Co-authored-by: Steven Johnson <srj@google.com> 24 July 2023, 18:22:51 UTC
943bc5f Convert error to warning (#7698) Accidentally checked in #7697 with the failure mode as error, not warning 24 July 2023, 18:19:50 UTC
128bcdf Add a warning if a Generator declares any Outputs before the final Input (Fixes #7669) (#7697) * Add a warning if a Generator declares any Outputs before the final Input (Fixes #7669) See https://github.com/halide/Halide/issues/7669 for details * Update abstractgeneratortest_generator.cpp * Add note about allow_out_of_order_inputs_and_outputs() to warning 24 July 2023, 17:44:03 UTC
71eb4ee Fix for top-of-tree LLVM (#7694) 21 July 2023, 00:56:11 UTC
475b774 Fix float16 under asan, attempt #2 (#7691) * Fix float16 under asan, attempt #2 Some sneakiness going on. * Update float16_t.cpp 19 July 2023, 19:13:02 UTC
0112da4 Fix quadratic algorithm in simplify_correlated_differences (#7686) This pass called expr_uses_var in a loop while building up a potentially long let chain. This does a quadratic amount of work in the size of the let chain, which stalled compilation for a particular pathological pipeline I encountered. This changes it to an eager algorithm that tracks the set of free variables and incrementally grows it instead of revisiting the entire expr for each new let added. It is n log(n) in the number of lets instead of n^2 Co-authored-by: Steven Johnson <srj@google.com> 19 July 2023, 18:27:35 UTC
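The incremental idea described in 0112da4 can be illustrated with a minimal sketch. This is not the code from the pass itself: `ToyLet` and `find_used_lets` are hypothetical names, and expressions are reduced to the list of variable names they mention. Instead of re-walking the whole wrapped expression for every new let, the sketch keeps a set of free variables and updates it once per let.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a let binding: the bound name plus
// the names of the variables that its value refers to.
struct ToyLet {
    std::string name;
    std::vector<std::string> value_vars;
};

// Given the free variables of the innermost body and a chain of lets ordered
// from innermost to outermost, report for each let whether anything uses it.
std::vector<bool> find_used_lets(std::set<std::string> free_vars,
                                 const std::vector<ToyLet> &lets_inner_to_outer) {
    std::vector<bool> used;
    used.reserve(lets_inner_to_outer.size());
    for (const ToyLet &let : lets_inner_to_outer) {
        assert(!let.name.empty());  // sanity check on the toy input
        // One set operation replaces a full re-traversal of the expression:
        // wrapping in the let binds `name`, so it stops being free.
        const bool is_used = free_vars.erase(let.name) > 0;
        used.push_back(is_used);
        if (is_used) {
            // The let's value contributes its own variables to the free set.
            free_vars.insert(let.value_vars.begin(), let.value_vars.end());
        }
    }
    return used;
}
```

Each let costs a few ordered-set operations, so the total work grows roughly as n log(n) in the number of lets (for values of bounded size) rather than n^2.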
18fbc15 Add Sanitizer details to README_cmake.md (#7688) 18 July 2023, 18:17:27 UTC
5f56e64 Add a select overload for tuples (#7672) * Add a select overload for tuples * Add missing overload * deprecate tuple_select * Fix Python bindings for deprecation of tuple_select() * Update PyIROperator.cpp --------- Co-authored-by: Steven Johnson <srj@google.com> 18 July 2023, 16:05:59 UTC
4ba0d8b Fix correctness_float16_t for ASAN builds (#7687) This appears to be a glitch that has to do with changing ABI for float16 across versions of GCC; we build LLVM with gcc-9 on Linux, but the float16 ABI got changed (and unified in gcc12); since ASAN builds use Clang even on linux, there is a hiccup here. This is an ugly monkey-patch to work around this issue. 18 July 2023, 16:03:37 UTC
601b5c5 Remove ParamMap (#7675) ParamMap was deprecated in Halide 16; per https://github.com/halide/Halide/pull/7357, we should go ahead and remove it for Halide 17, in favor of `compile_to_callable()`. 11 July 2023, 19:37:55 UTC
41d6d94 Update onnx app to Adams2019 autoscheduler and new autoscheduler API (#7673) * Update onnx app to Adams2019 autoscheduler and new autoscheduler API Fixes #7670 * Add model test too * Remove use of tmpnam * Don't test onnx app in a 32-bit build 11 July 2023, 16:52:18 UTC
9755e3d Attempt to fix intermittent PCH "modified" errors (#7666) * Attempt to fix intermittent PCH "modified" errors * Update CMakeLists.txt * Update CMakeLists.txt Co-authored-by: Alex Reinking <alex.reinking@gmail.com> --------- Co-authored-by: Alex Reinking <alex.reinking@gmail.com> 29 June 2023, 17:15:03 UTC
6f2cae6 Dependency wrangling part 0/N: standard CMake modules (#7658) * Hoist Threads::Threads to the top level * Remove global OpenGL dependency This is added by the helpers as-needed. Removing it here lets one build just libHalide without searching for OpenGL. * Narrow scope of OpenMP to tutorial Only the tutorial targets actually use OpenMP. Don't search for OpenMP if WITH_TUTORIALS is off. * Move JPEG and PNG deps to tools Only the Halide::ImageIO library uses these directly, so limiting the scope protects against unintended use. * Work around CMake bug The CMake $<TARGET_NAME_IF_EXISTS:...> genex uses dynamic scoping w.r.t. the target environment, rather than the usual static scoping. This means we need to move the PNG and JPEG dependencies higher up. * Add link to CMake issue in comments. 28 June 2023, 16:38:47 UTC
470f43c Bump Halide version to 17.0.0 in main (#7636) * Bump Halide version to 17.0.0 in main * Bump compatible LLVM version requirements to 17, 16, 15. Update build instructions to use newer LLVM version. * Bump clang-format/tidy LLVM version to 15 (minimum required to build Halide) * trigger buildbots * Revert LLVM requirements for run_clang_format/tidy. Do this in a separate PR. --------- Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> Co-authored-by: Steven Johnson <srj@google.com> 27 June 2023, 17:34:55 UTC
c7ca15f Enable clang-tidy's modernize-use-default-member-init check (#7662) * Upgrade clang-format and clang-tidy to use v16 (Skipping over 15 entirely in favor of the newest stable version) * Update presubmit.yml * Update .clang-tidy * Update .clang-tidy * fixes * Update run-clang-tidy.sh * Update .clang-tidy * Update .clang-tidy * fixes * Update .clang-tidy * Update PyHalide.cpp * Update run-clang-tidy.sh * Update CodeGen_Vulkan_Dev.cpp * Update .clang-tidy * fix * format 26 June 2023, 22:22:27 UTC
c28a00f Update for top-of-tree LLVM changes (#7663) 26 June 2023, 20:10:27 UTC
1e3431c Enable the misc-use-anonymous-namespace clang-tidy check (#7661) * Upgrade clang-format and clang-tidy to use v16 (Skipping over 15 entirely in favor of the newest stable version) * Update presubmit.yml * Update .clang-tidy * Update .clang-tidy * fixes * Update run-clang-tidy.sh * Update .clang-tidy * Update .clang-tidy * fixes * Update .clang-tidy * Update PyHalide.cpp * Update run-clang-tidy.sh * Update CodeGen_Vulkan_Dev.cpp * Enable the misc-use-anonymous-namespace clang-tidy check Basically just says "don't use static" * Update Generator.h * Update Util.cpp * Update JITModule.cpp 24 June 2023, 01:35:17 UTC
c2e4f6d Upgrade clang-format and clang-tidy to use v16 (#7660) * Upgrade clang-format and clang-tidy to use v16 (Skipping over 15 entirely in favor of the newest stable version) * Update presubmit.yml * Update .clang-tidy * Update .clang-tidy * fixes * Update run-clang-tidy.sh * Update .clang-tidy * Update .clang-tidy * fixes * Update .clang-tidy * Update PyHalide.cpp * Update run-clang-tidy.sh * Update CodeGen_Vulkan_Dev.cpp 24 June 2023, 01:33:46 UTC
2a93cb0 Get the ASAN toolchain working again (#7604) * Get the ASAN toolchain working again Various fixes to enable ASAN to finally work (linux x64 only). Note that this found several ASAN failures in the Anderson2021 autoscheduler tests, which are *not* fixed yet; I'll fix those in a subsequent PR. * Remove stuff that I didn't mean to check in * Configure cuda-specific tests properly too * trigger buildbots * Update CodeGen_LLVM.cpp * Update CodeGen_LLVM.cpp * Fix sloppiness? * Update CMakeLists.txt * trigger buildbots * Use Halide_PYTHON_LAUNCHER to implement ASAN toolchain fixes (#7657) * Use new Halide_PYTHON_LAUNCHER to set env vars * Update CMake docs for Halide_SANITIZER_ENV_VARS --------- Co-authored-by: Alex Reinking <areinkin@qti.qualcomm.com> --------- Co-authored-by: Alex Reinking <alex.reinking@gmail.com> Co-authored-by: Alex Reinking <areinkin@qti.qualcomm.com> 23 June 2023, 20:53:14 UTC
0de9eb2 Fix incorrect name-mangling for llvm.experimental.vp.strided.load (#7654) These ops are only used for RISCV codegen at present, and this one tended to only happen for complex patterns that we don't test in our very limited crosscompilation tests. 23 June 2023, 17:47:03 UTC
0218c9e Add a compositing example app (#7646) * Initial version of a compositing demo app * Improve schedule; add GPU version * Better mux codegen * Consider all definition exprs in mullapudi autoscheduler * Add Tuple mux to IROperator * clang-format, better comments * Remove pointless blank line * Add some fixed-point intrinsics to RegionCosts.cpp to suppress warnings * Add perf numbers * Hopefully fix cmake build * clang-format * clang-format * Fix muxing FuncRefs * More comments * Update process.cpp * Include cmath to hopefully get M_PI * Revert inclusion of cmath --------- Co-authored-by: Steven Johnson <srj@google.com> 23 June 2023, 15:21:38 UTC
1e963ff Default RISCV backend to OFF for LLVM < 17 (#7650) LLVM17 is doing a lot of work on the RISCV backend, and the amount of testing done on Halide's LLVM16-based RISCV codegen is very light. It's been suggested that we should default to not enabling the RISCV backend for LLVM16 and earlier because of this (so that people attempting to use Halide for RISCV won't encounter a possible footgun). This PR just adds the relevant mechanism; whether or not this is the correct decision is not clear. Discussion welcome. 22 June 2023, 21:45:22 UTC
9232218 Fix RISCV codegen for top-of-tree LLVM (#7648) * Fix RISCV codegen for top-of-tree LLVM Also add a warning if you try to codegen with older versions of LLVM: many intrinsics have changed in ways that are hard to deal with both ways, and trying to support both would be painful and of dubious value. * Make LLVM16 work too * Update CodeGen_RISCV.cpp 22 June 2023, 18:20:20 UTC
bd42076 Add user_assert for zero vector width in CodegenRISCV (#7647) * Add user_assert for zero vector width in CodegenRISCV If you forget to add `-rvv-vector_bits_N` to your Target string, we try to codegen with a vector width of 0, which (unsurprisingly) craters in many places which assume a nonzero value. It's pretty unlikely anyone wants to use Halide to codegen to a RISCV core that lacks SIMD, so let's add a more helpful failure message for this easy-to-make error (we can revisit this later if it actually is desirable for some reason.) (I looked briefly at trying to clean up all the places in CodegenLLVM, etc, that make that assumption, but it quickly turned into a rat's nest; it's definitely fixable if we want to support this in the future, but, again, I suspect we don't.) * Update CodeGen_RISCV.cpp 21 June 2023, 22:48:44 UTC
8acdc46 Be more careful about overflow in trim_bounds_using_alignment (#7645) * Be more careful about overflow in trim_bounds_using_alignment Fixes #7575 * trigger buildbots --------- Co-authored-by: Steven Johnson <srj@google.com> 20 June 2023, 22:40:21 UTC
3b7e83a Alternative fix for #4211 (#7628) * Alternative fix for #4211 Call::Prefetch evaluates to a currently-unspecified value of the prefetched type. Let's just make it zero. * Fix prefetch_2d * Fix CodeGen_C * Fix CodeGen_C * trigger buildbots --------- Co-authored-by: Steven Johnson <srj@google.com> 17 June 2023, 00:21:19 UTC
2149734 Revise LLVM fix to work when no V8 or WABT available (#7635) * Revise LLVM fix to work when no V8 or WABT available * Update WasmExecutor.cpp * Update WasmExecutor.cpp * Update WasmExecutor.cpp 15 June 2023, 00:48:09 UTC
932ad0b Deprecate OpenGLCompute for Halide 16 (#7627) * Deprecate OpenGLCompute for Halide 16 * clang-format 14 June 2023, 17:15:18 UTC
1f5b207 Fix wasm linker for top-of-tree LLVM (#7634) 13 June 2023, 23:55:36 UTC
37fd8c4 Bump HALIDE_VERSION_MAJOR to 16 in makefile in prep for release (#7632) Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> 13 June 2023, 20:34:48 UTC
fa3d87c Fix inverted may_subtile checks (#7626) 13 June 2023, 16:31:34 UTC
bd62a35 Significant change to RISC V and scalable vector code generation. (#7616) * Completely rework how RISC-V vector intrinsics are called to avoid issues with single element vectors being confused with scalars and other conversions that can happen via using call_intrin. Allows using any size vector. Only downside is splitting large vectors no longer happens, but RISC-V allows an LMUL of 8, meaning a vector of up to 8 times the vector register size will compile so this is much less of an issue. Splitting larger vectors can be added. Should also allow fractional LMUL in all cases, but this is not verified. * Significant refactor/rewrite of RISC V vector intrinsics support. Should handle many more cases and be well on the way to handling arbitrary vector widths within the LMUL range. More tests added to simd_op_check_riscv . Likely well set up to move SVE2 to a similar approach, perhaps without the full generality on vector lengths. (I.e. they may need to be quantized to vscale, or offer better performance in that case.) * Formatting fixes. * More formatting. * Don't try to convert void types to match expected vector type. * Backout comment change that is no longer relevant. * Fix failure in camera_pipe app. (Code to make sure vector types match was being presented with a scalar only mismatch. Changed it to ignore scalar to scalar cases.) Address review feedback. * One more review comment. * Comment fix. --------- Co-authored-by: Steven Johnson <srj@google.com> 08 June 2023, 01:18:19 UTC
67eaff3 Upgrade our PyBind11 version to 2.10.4 (#7617) (#7618) * Upgrade our PyBind11 version to 2.10.4 (#7617) * Forgot to save 07 June 2023, 00:30:45 UTC
123d855 Fix PCH build failures (#7613) * Fix PCH build failures (Harvested from #7604 to land separately) * Update CMakeLists.txt 06 June 2023, 16:28:34 UTC
ffd20c9 Revert "[Hexagon] Fix compilation failures hexagon_remote" (#7614) Revert "[Hexagon] Fix compilation failures hexagon_remote (#7601)" This reverts commit bd33a629adfd89129d21fe82e68fc7d20f935283. 05 June 2023, 20:33:21 UTC
2304dd8 Add missing deps for some autoscheduler tests (#7605) * Add missing deps for some autoscheduler tests Autoscheduler tests that rely on the relevant shared library being available at runtime need to add a dependency to ensure this is the case. * Update CMakeLists.txt * Update test.cpp * Update CMakeLists.txt * Update CMakeLists.txt * Update test/autoschedulers/li2018/CMakeLists.txt Co-authored-by: Alex Reinking <alex.reinking@gmail.com> --------- Co-authored-by: Alex Reinking <alex.reinking@gmail.com> 05 June 2023, 19:16:59 UTC
51e4e04 Add target triple setup for RISC V Android. (#7612) Add target triple setup for RISC V Android. More guess based than test based but I'm 80% confident these are good choices. Data layout seems to be the same as well. 05 June 2023, 19:16:21 UTC
7e57438 Silence `psabi` warnings when compiling C++ generated code (#7603) * Silence `psabi` warnings when compiling C++ generated code Some versions of GCC/Clang emit many of these warnings when compiling in some Intel configurations, and they are useless in this context. Make them go away. * Update cmake/HalideGeneratorHelpers.cmake Co-authored-by: Alex Reinking <alex.reinking@gmail.com> --------- Co-authored-by: Alex Reinking <alex.reinking@gmail.com> 05 June 2023, 16:59:48 UTC
9ee2d0c Update Compiler/OS versions in README (#7610) 03 June 2023, 00:34:54 UTC
f3e1829 Adds fuzzing preset (#7566) * Adds fuzzing preset Partial fix for #7552 * Adds documentation on fuzz testing Closes: #7552 * Fixes spelling/grammar in fuzzing readme Co-authored-by: Alex Reinking <alex.reinking@gmail.com> * Remove asan flags from fuzzer * Add build directory in cmake/fuzzing documentation * Configure the fuzz tests to run for a finite amount of time * Update README * Update README_fuzz_testing.md * trigger buildbots * trigger buildbots * trigger buildbots * Update CMakeLists.txt --------- Co-authored-by: Steven Johnson <srj@google.com> Co-authored-by: Alex Reinking <alex.reinking@gmail.com> 01 June 2023, 20:46:49 UTC
4991231 Disable fuzzer when using ASAN (#7602) 01 June 2023, 19:52:02 UTC
bd33a62 [Hexagon] Fix compilation failures hexagon_remote (#7601) Fix for: 1. Include directory for pthread.h 2. Function signature for qurt_hvx_lock Co-authored-by: Ankit Aggarwal <aankit@quicinc.com> 01 June 2023, 18:55:31 UTC
3e6e2a5 Fix operator/ on ModulusRemainder (#7597) It wasn't reducing the remainder modulo the modulus, which confused trim_bounds_using_alignment in the simplifier. 01 June 2023, 17:36:56 UTC
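To make the point of 3e6e2a5 concrete, here is a toy sketch of dividing a "congruent to remainder mod modulus" fact by a constant and then reducing the remainder. It is an assumption-laden illustration, not Halide's `ModulusRemainder` code: `ToyModRem`, `floor_div`, `euclid_mod`, and `toy_divide` are hypothetical names, and the sketch only handles a positive divisor that evenly divides the modulus.

```cpp
#include <cassert>
#include <cstdint>

// Toy fact about an integer x: x == modulus * k + remainder for some integer k.
struct ToyModRem {
    int64_t modulus;
    int64_t remainder;
};

// Division rounding toward negative infinity, for b > 0.
int64_t floor_div(int64_t a, int64_t b) {
    int64_t q = a / b;
    if (a % b < 0) q -= 1;
    return q;
}

// Remainder in the range [0, m), for m > 0.
int64_t euclid_mod(int64_t a, int64_t m) {
    int64_t r = a % m;
    return r < 0 ? r + m : r;
}

// If b divides the modulus, floor(x / b) == (modulus / b) * k + floor(remainder / b).
// The final euclid_mod is the reduction step: without it the stored remainder
// can fall outside [0, modulus / b).
ToyModRem toy_divide(ToyModRem a, int64_t b) {
    assert(a.modulus > 0 && b > 0 && a.modulus % b == 0);
    const int64_t m = a.modulus / b;
    const int64_t r = floor_div(a.remainder, b);
    return ToyModRem{m, euclid_mod(r, m)};
}
```

For instance, starting from an unreduced fact such as "x is 13 mod 8", dividing by 2 without the last step would leave a remainder of 6 against a modulus of 4; the reduction normalizes it to 2.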
eb9b946 Apply fix from #7564 to fuzz/bounds (#7596) (Avoids infinite loop for some fuzzing inputs) 30 May 2023, 22:39:41 UTC
b450647 [Fix for #7524] Skip tests for anderson2021 if PTX is not enabled (#7593) Skip tests for anderson2021 if PTX is not enabled 25 May 2023, 21:07:38 UTC
ca8ca00 Pacify clang-tidy by removing unused constant (#7590) 24 May 2023, 20:14:29 UTC
6a98655 fuzz: Add libfuzzer compatible bounds fuzzer (#7549) * fuzz: Add libfuzzer compatible bounds fuzzer * Remove unused constant * Style fix * Fix handling of binary ops * Handle casting to vector-of-bool properly * fuzz: Alphabetically sort targets in CMake --------- Co-authored-by: Steven Johnson <srj@google.com> 22 May 2023, 17:16:28 UTC
d234143 Check for slightly different error msg in AppleClang 14.0.3 (#7582) * Check for slightly different error msg in AppleClang 14.0.3 * Update Makefile 18 May 2023, 19:04:34 UTC
02768ef In fuzz/simplify, output errors to cerr, not cout (#7583) * In fuzz/simplify, output errors to cerr, not cout This makes it easier to capture error output in downstream test harnesses * Also add some more helpful text 18 May 2023, 17:43:49 UTC
4282a5d Fix #7579 (#7580) Fix per @jrprice. (He comments that we should probably regenerate all of mini_webgpu.h, and document how to do that; this is a band-aid to unbreak testing.) 18 May 2023, 17:05:36 UTC
2ed955e Fix various compilation errors with AppleClang 14.0.3 (#7578) * Change & -> && usage Newer versions of Xcode trigger `-Wbitwise-instead-of-logical` for this usage, which we treat as an error * Also fix `error: variable 'i' set but not used [-Werror,-Wunused-but-set-variable]` * Also fix `retrain_cost_model.cpp:419:17: error: variable 'counter' set but not used [-Werror,-Wunused-but-set-variable]` 18 May 2023, 00:49:25 UTC
6c8f7aa [vulkan] Fix subregion memory offsets to respect buffer alignment (#7576) * Fix buffer alignment constraints for subregion allocations (some drivers report a minimum alignment for the buffer that is larger than the storage or uniform storage offset alignment) Cleanup region offset and size constraints * Clang tidy/format pass --------- Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> 17 May 2023, 23:05:00 UTC
30d309e [vulkan] Change the feature version requirement to v1.3 for correctness_gpu_dynamic_shared (#7577) Change the feature version requirement to v1.3 (since v1.2 lacks the necessary support). Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> 17 May 2023, 23:04:45 UTC
2fd90bf [vulkan] Disable generator acquire_release test for Vulkan (#7565) Disable test for Vulkan Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> 17 May 2023, 18:47:36 UTC
968e52c Upgrade WABT to 1.0.33 (#7570) * Upgrade WABT to 1.0.33 * Update CMakeLists.txt * Update CMakeLists.txt 17 May 2023, 17:09:47 UTC
76bb84d Allow autoconversion from `Buffer<T>` -> `Buffer<const T>&` and to `Buffer<void>&` (#7571) * Allow autoconversion from `Buffer<T>` -> `Buffer<const T>&` When you are intermixing CPU and GPU calls in a single piece of code, it's preferable to pass `Buffer<>` by nonconst reference, so that lazy host<->device copies are done efficiently. However, many callers prefer to define input Buffers as `Buffer<const T>` (as they should), but the fact that this form didn't easily allow autoconversion from caller (which may well have constructed the buffer as non-const) to callee (due to incompatible type references) led some users to just pass by a copy, since these autoconverted. This had a couple of undesirable effects: - Making a copy cost a small but nonzero amount of code (managing refcounts, etc) - More importantly, lazy copies in the callee got 'lost' to the caller, since the `halide_buffer_t` in the callee was a copy, thus any added `device` value or change in dirty bits was never seen. This could previously be worked around by adding explicit calls to `.as_const()`, but that is ugly and awkward. This change adds an ugly-but-safe implicit-conversion overload, to allow converting `Buffer<T>&` to `Buffer<const T>&`, iff T isn't already const. This will allow cleaning up downstream code to pass by references more consistently, without needing to add `.as_const()` warts. * Also add convenience conversions for Buffer<void>& 16 May 2023, 18:25:08 UTC
f121abf Fix save_tiff() PlanarConfig assignment for monochrome inputs (#7568) Fixes #7567. 15 May 2023, 23:47:02 UTC
ae53d9b Avoid potentially infinite loop in fuzz/simplify.cpp (#7564) FuzzedDataProvider is *not* a RNG; there's no guarantee that it won't return the same data to you forever. This means that the loop to find a new subtype may never terminate (eg if the 'random' type returned always matches the input type). This "fixes" it by just adding a count to break out of the loop, in which case we just use the original type. Not sure if there's a more elegant fix? 12 May 2023, 18:18:20 UTC
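The bounded-retry fix described in ae53d9b can be sketched as follows. This is a hedged illustration, not the code in fuzz/simplify.cpp: `ToyDataSource` is a hypothetical stand-in for a fuzzer data provider (it replays a fixed byte sequence and then returns 0 forever), and `pick_different_type` and `max_attempts` are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for fuzzer-supplied data. Unlike an RNG, it may keep
// returning the same value: once the bytes run out, pick() always yields 0.
class ToyDataSource {
    std::vector<uint8_t> bytes_;
    size_t pos_ = 0;

public:
    explicit ToyDataSource(std::vector<uint8_t> bytes)
        : bytes_(std::move(bytes)) {
    }

    // Returns a value in [0, n).
    int pick(int n) {
        assert(n > 0);
        if (pos_ >= bytes_.size()) return 0;
        return bytes_[pos_++] % n;
    }
};

// Try to pick a type index different from `current`. The attempt counter
// guarantees termination even if the source repeats the same answer forever;
// in that case the original type is kept.
int pick_different_type(ToyDataSource &src, int current, int num_types) {
    constexpr int max_attempts = 8;
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        const int candidate = src.pick(num_types);
        if (candidate != current) return candidate;
    }
    return current;
}
```

An unbounded `while (candidate == current)` loop over the same source would spin forever once the data is exhausted and `current` is 0, which is the failure mode the commit message describes.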
e0ef57a Remove unique_name() usage from fuzz/cse (#7563) 12 May 2023, 16:37:40 UTC
252c4b8 Add/augment some runtime debug output (#7561) - in `halide_buffer_to_string()`, print the `halide_buffer_t*` pointer value as well - in `debug_log_and_validate_buf()`, do debug logging for some failure modes that return errors 11 May 2023, 17:07:30 UTC
afea893 [vulkan] Disable performance_wrap test for Vulkan ... results don't match (#7560) * Fix missing initializer for vulkan memory config that got munged in a previous merge. This gets the correctness_multiple_outputs test to pass. * Disable test for Vulkan since shared memory results are incorrect (see issue #7559) * Clang tidy/format pass --------- Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> 10 May 2023, 15:55:38 UTC
53de4ce Fix #7556 (#7557) * Fix #7556 * Update cast.cpp * Add user_assert that type lanes match * Revert "Add user_assert that type lanes match" This reverts commit e1f34e0c3098a4952af64ae88632bb2ada9763b1. 09 May 2023, 18:24:07 UTC
8f22013 Followup to #7551 for bool vectors (#7555) Need to cast to a type that is bool-with-lanes, not scalar bool 09 May 2023, 17:00:40 UTC
acde515 [vulkan] Fix missing initializer for vulkan memory config (#7554) Fix missing initializer for vulkan memory config that got munged in a previous merge. This gets the correctness_multiple_outputs test to pass. Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> 09 May 2023, 17:00:12 UTC
763d207 Fix fuzz/cse to avoid signed_integer_overflow() results (#7553) * Fix fuzz/cse to avoid signed_integer_overflow() results * Update cse.cpp 08 May 2023, 23:16:57 UTC
7afb343 Fix errors in fuzz/simplify.cpp (#7551) * Style Fix: don't use uppercase-T for non-template arguments * Boolean ops need extra type coercion 08 May 2023, 20:41:31 UTC
fb71862 Fix unused-thing warnings in fuzz/simplify.cpp (#7548) * Fix unused-thing warnings in fuzz/simplify.cpp * Update simplify.cpp 08 May 2023, 01:39:41 UTC
c86d418 fix(fuzz): Refactor fuzzers to fix off by 1 errors (#7547) Clean up the fuzzers, making them more readable, and fix off-by-one errors caused by incorrect usage of FuzzedDataProvider::ConsumeIntegralInRange. Closes: #7546 05 May 2023, 01:33:25 UTC
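The class of off-by-one error mentioned in c86d418 typically comes from treating an inclusive range as half-open. The sketch below assumes an API whose range is inclusive at both ends, which is how `ConsumeIntegralInRange` is documented; `consume_in_range` and `pick_element` are hypothetical helpers that take a raw value instead of reading fuzzer bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an inclusive-range API: maps `raw` into [min, max].
int consume_in_range(uint32_t raw, int min, int max) {
    assert(min <= max);
    // 64-bit arithmetic so that max - min + 1 cannot overflow.
    const int64_t span = static_cast<int64_t>(max) - min + 1;
    return static_cast<int>(min + static_cast<int64_t>(raw % static_cast<uint64_t>(span)));
}

// Picking a vector element: because both ends are inclusive, the upper bound
// must be size() - 1. Passing size() would allow an out-of-bounds index.
template<typename T>
const T &pick_element(uint32_t raw, const std::vector<T> &v) {
    assert(!v.empty());
    const int index = consume_in_range(raw, 0, static_cast<int>(v.size()) - 1);
    return v[index];
}
```

With a three-element vector, the valid call is `consume_in_range(raw, 0, 2)`; asking for `[0, 3]` is the mistake this pattern guards against.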
dff1e38 Remove workaround for GCC 4.x.x in cpuid() (#7545) * Remove workaround for GCC 4.x.x 02 May 2023, 20:38:17 UTC
96acbc6 Workaround for Issue #7539 (#7540) * Workaround for Issue #7539 Partial fix for now * trigger buildbots 02 May 2023, 20:37:47 UTC
05316af metal : replacing spinlock by mutex (#7532) replacing spinlock by mutex 02 May 2023, 19:01:29 UTC
2945c71 fuzz: Port correctness/cse fuzzer over to libfuzzer (#7543) 02 May 2023, 16:22:26 UTC
7cdbc71 Rework CMake interface for Dawn/Node bindings (#7422) AOT pipelines that rely on Dawn/WebGPU now depend on a new Halide_WebGPU find-module. This module honors the make-ish HL_WEBGPU_NATIVE_LIB variable as a means of initializing the Halide_WebGPU_NATIVE_LIB cache variable. This is automatically handled by add_halide_generator and add_halide_runtime and is available to downstreams. The JIT tests no longer read the HL_WEBGPU_NODE_BINDINGS environment variable during the CMake configure or build phase. Instead, a test launcher reads it at CTest runtime. Co-authored-by: Alex Reinking <quic_areinkin@quicinc.com> 02 May 2023, 04:00:16 UTC
6db47d3 Fix flag check for fuzzers (#7542) On some systems, size_t isn't available under <cstdint>; however, it is guaranteed to be available under <cstddef> on all systems. 01 May 2023, 23:46:52 UTC
38ed15d Fix some autoscheduler build errors (#7538) - Remove inadvertent duplicate of PerfectHashMap.h from adams2019 - add some missing #includes - never pass negative values to exit() 01 May 2023, 17:08:54 UTC
044a8cf Add libfuzzer compatible fuzz harness (#7512) 01 May 2023, 14:00:36 UTC
244e72c Avoid endless loop in msan + zero-extent buffer (#7536) With MSAN enabled, we use `make_buffer_copy()` to build an efficient way to check the poison bits on buffers; unfortunately, if you are checking a buffer that has at least one dimension with zero-extent but nonzero-stride, the final while loop will never terminate. Add a trivial check so that it exits. 26 April 2023, 22:37:36 UTC
4d86539 [vulkan phase2] Vulkan Runtime (#6924) * Import Vulkan runtime changes from personal branch * Fix build to work with latest changes in main * Hookup Vulkan into Target, DeviceInterface and OffloadGPULoops * Add Vulkan runtime to Makefile * Add Vulkan target to Python bindings * Add runtime linker support to target Vulkan CodeGen * Add Vulkan windows decorator to runtime targets * Wrap debug messages for internal runtime classes with DEBUG_INTERNAL Error on failed string termination * Silence clang-tidy warnings for redundant expressions on Vulkan enum values * Clang tidy & format pass * Fix formatting for single line statements * Move Vulkan option to top-level CMakeLists.txt and enable SPIR-V as needed * Fix Vulkan & SPIRV dependencies for makefile * Add Halide version info to Makefile Add HALIDE_VERSION compiler definitions to compilation * Add HL_VERSION_FLAGS to RUNTIME_CXX_FLAGS * Finish refactoring of Vulkan CodeGen to use SpirV-IR. Added splitmix64 based hashing scheme for types and constants. Numerous fixes to instruction packing. Added debug symbols to all variables. * Clang tidy/format pass. * Fix formatting * Remove leftover ifdef * Fix build error for clang OSX for mismatched type comparison * Refactor loops and conditionals to use blocks * Clang tidy/format pass * Add detailed comments for acquire context parameters * Add comments describing loader method exports and dynamically resolved function pointers Other minor cleanups * Change aborts to debug asserts for context parameters. Add error handling to acquire context. * Cache Vulkan descriptor sets and other shader module objects in compilation cache for reuse * Replace platform specific strncpy for grabbing Extension strings with StringUtils::copy_upto * Enable device features for selected device * Fix alignment constraints for to match Vulkan buffer memory requirements. Add env vars to control Vulkan Memory Allocator config. 
* Add Vulkan to list of supported APIs in README.md Add Vulkan specific README_vulkan.md * Clang tidy/format pass * Fix conform_alignment to handle zero values * Fix declaration of custom_allocation_callbacks to be static. Change to constexpr for invalid values * Whitespace change to trigger build. * Handle Vulkan kernels that don't require storage buffers. Updated test status. Fixes 7 test cases. * Add src/mini_vulkan.h Apache 2.0 license requirements to License file * Add descriptor set binding info as pre-amble to SPIR-V code module Fix shared memory allocation to use global variables in workgroup storage space Add extern calls for spirv and glsl builtins Add memory fence call to gpu thread barrier Add missing visitors to Vulkan CodeGen Add scalar index & vector index methods for load/store * Clang tidy & format pass * Update test results for Vulkan docs. Passing: 326 Failing: 39 * Fix formatting * Remove extraneous parentheses for is_array_type() * Add Vulkan library to linkage for Halide generator helpers * Add SPIR-V formatted output (for debugging) * Only declare SIMT intrinsics that are actually used. Cleanup & refactor add_kernel method. * Add Vulkan handler to test targets * Clang format/tidy pass * Add doc-strings to SPIR-V interface * Adjust runtime array to widest vector width based on alignment and dense vector loads/stores Fix scalar and vector load/stores Fix casts for vectors Add missing nan, inf, neg_inf, is_finite builtins * Add missing bitwise and logical and methods. Cleanups. * Add comments about necessary packages on Ubuntu v22.04 vs earlier versions * Clang tidy & format pass. * Update Vulkan test results. 
Pass: 329 Fail: 36 * Remove unused Produce/Consume visitor method * Fix Molten VK initialization to work with v1.3+ loader Add support for direct casts for same-size types Add missing mux, mix, lerp, sinh, tanh, etc intrinsics Add explicit storage access for variables Add a macro to enable debug messages in Vulkan Memory Allocator * Disable dynamic shared memory portion of test for Vulkan (since it's not supported yet) * Disable uncached portion of test for Vulkan (since it may OOM) * Disable float64 support in Type::supports_type() for Vulkan target since it's not widely supported * Fix Shuffle to handle all known cases Hookup VulkanMemoryAllocator to gpu allocation cache. Fix if_then_else to allow calls and statements to be used Fix loop counter comparison, and don't allow dynamic loops to be unrolled. Fix scalarize to use CompositeInsert instead of VectorInsertDynamic Fix FMod to use FRem (cause SPIR-V's FMod doesn't do what you'd expect ... but FRem does?!) Use exact same semantics for barriers as GLSL Compute ... still not passing everything Fix SPIR-V block termination checks, keys for null constants, and other cleanups * Clang tidy & format pass * Update correctness test results. PASS: 338, FAIL: 27 * Move counter inside debug #define to fix build * Relax tolerance for newton's method to match other GPU APIs Skip gpu dynamic shared test for Vulkan (since dynamic shared allocations aren't supported yet) Update correctness test status. PASS: 340, FAIL: 25 * Clang format/tidy pass * Skip Vulkan for float64 for correctness test round (since f64 is optional) * Skip Vulkan for tests that rely upon device crop, and slice. 
* Only test small vector widths for Vulkan (since widths >=8 are optional) * Canonicalize gpu vars for Vulkan * Fix loop initialization, and increments Add all explicit types, and fix constant declarations Add missing fast intrinsics Convert results of logical ops into expected types (instead of bools) * Add SpvInstruction::add_operands(), add_immediates() and template based append() Make integer logical operations explicit. Better handling of constant data. * Clang format & tidy pass * Fix windows build ... refactor convert_to_bool to use std::vectors rather than dynamic fixed sized arrays * Skip async_device_copy, device_buffer_copy, device_crop, and device_slice tests for Vulkan (for now). * Don't test large vector widths for Vulkan (since they are optionally supported) * Clear Vulkan buffer allocations prior to use (tbd if this is necessary) * Skip Vulkan for async copy chain test * Skip Vulkan for interpreter test * Clang tidy/format pass * Fix formatting * Fix build ... use error messages for errors * Separate shared memory resources by element type for Vulkan. * Add Vulkan to conditional for fusing gpu loops * Reorder reset method to match declaration ordering. * Cleanup debug log messages for Vulkan resources * Assert alignment is power of two * Only split regions that have already been freed. Add more debug messages to log * Explicitly cleanup Vulkan command buffers after they are used Avoid recreating descriptor sets Tidy up Vulkan debug messages * Fix Div, Mod, and div_round_to_zero for integer cases Cleanup reset method * Skip Vulkan for async_copy_chain * Skip 64-bit values on Vulkan since they are optionally supported * Skip interleave_rgb for Vulkan (which doesn't support cropping) * Skip interpreter for Vulkan (which doesn't support dynamic allocation of shared mem). 
* Clang Tidy/Format pass * Handle calls to pow with negative values for Vulkan Add integer and float constant helpers to SPIRV * Only test real numbers for pow with Vulkan * Clang tidy/format pass * Fix logic so a region request of an entire block matches if exactly the same size as an empty block * Create a zero size buffer to check for alignment Return null handles after freeing * Add more verbose debug output for malloc * Fix UConvert logic to avoid narrowing an integer type less than 8 bits Remove optimization path for division which seems to fail worse than DIV Cleanup DIV and MOD operators * Clang format/tidy pass * Fix SConvert & UConvert ops * Add retain semantics to block allocator interface Update test to validate retain/release/reclaim functionality * Implement device_crop, device_slice and release_crop for Vulkan. Re-enable device_crop, device_slice and interleave_rgb tests. * Clang format/tidy pass * Implement device copy for Vulkan. Enable device copy test. * Clang format/tidy pass * Fix signed mod operator and use euclidean identity (just like glsl) * Clang format/tidy pass * Fix to handle Mod on vectors (use vector constant for bitwise and) * Fix pow operator for Vulkan, and re-enable math test to full range. * Add error checking for return types for conditionals Use bool types for ops that require them, and adapt to expected return types * Handle deallocation for existing regions prior to coalescing. Cleanup region allocator logic for availability. Augment block_allocator test to cover allocation reuse. 
* Clang tidy/format pass * Fix reserved accounting for regions * Add more details to Windows specific Vulkan build config * Update SPIR-V headers to v1.6 * Add support for dynamic shared memory allocations for Vulkan Add dynamic workgroup dispatching to Vulkan Add optional feature flags for Vulkan capabilities Add Vulkan API version flags for target features Enable v1.3 path if requested Re-enable tests for added features Update Vulkan docs with status updates and feature flags * Enable Vulkan async_device_copy test. * Disable Vulkan performance test for async gpu (for now). * Disable Vulkan from python AOT tests and tutorials (since it requires linkage against the vulkan loader system library). * Update Vulkan readme with latest status. Everything works! More or less. =) * Clang format pass * Cleanup formatting for Halide version info in Makefile * Fix typos and address review comments for Vulkan readme * Change value casts to match Halide conventions * Fix typos in comments * Add static_assert to rotl to make compilation errors clearer (instead of using enable_if) Fix debug(3) formatting to avoid super long messages Use lookup table for SPIR-V op code names * Fix typos and logic for Vulkan capabilities * Remove leftover debug ifdef * Fix typo in comments * Rename copy_upto(...) method to be copy_up_to(...) 
* Handle error case for uninitialized buffer allocation (rather than abort) Fix typos in comments * Support an arbitrary number of devices and queues for context creation Fix typos in comments * Add get/set alloc_config methods and API hooks for configuring the VulkanMemoryAllocator * Remove leftover debug ifdef * Hookup API methods for get/set alloc_config when initializing the VulkanMemoryAllocator * Remove empty lines in main * Add required capability flags for 8-bit and 16-bit uniform and storage buffer access Handle casts for GLSL ops (spec requires all args to be the same type as the return type) * Add VkPhysicalDevice8BitStorageFeaturesKHR and related constants * Query for 8-bit and 16-bit uniform and storage access support. Enable these as part of the device feature query chain. * Use VK_WHOLE_SIZE for setting buffer (to pass validation ... otherwise size has to be a multiple of alignment) Remove useless debug asserts for static variables Fix debug logging messages for allocations of scalars (which may not have a dim array) * Query for device limits to enforce min alignment constraints for storage and uniform buffers * Fix shutdown sequence to iterate over descriptor sets Avoid bug in validation layer by reordering destruction sequence * Clang format & tidy pass * Fix logic for locating entry point shader binding ... 
assume exact match for entry point name Cleanup entry point binding variables and clarify usage * Remove accidentally uncommented debug statements * Cleanup debug output for buffer related updates * Fix split and allocate methods in region allocator to fix issues with alignment constraints - discovered a hang if requested size couldn't be fulfilled after adjusting to aligned sizes - cause was incorrect splitting of existing regions Cleanup region allocator iteration, cleanup and shutdown Added maximum_pool_size configuration option to Vulkan Memory Allocator to restrict pool sizes * Added notes about TARGET_VULKAN=ON being the default now Added links to LunarG MoltenVK SDK installer, and brew packages * Fix markdown formatting * Fix error code handling in Vulkan runtime and internal datastructures. Refactor all (well nearly all) return values to use halide error codes. Reduce the usage of abort_if() for recoverable errors. * Fix typo in error message * Fix typo in readme * Skip GPU allocation cache test on MacOSX since MoltenVK only supports 30 buffers to be allocated * Skip widening reduction test on Vulkan for Mac OSX/IOS since MoltenVK fails to translate calls with vector types for builtins like min/max. etc * Skip doubles in vector cast test on Vulkan for Mac OSX/IOS since Molten doesn't support them * Skip gpu_dynamic_shared and gpu_specialize test for Vulkan on Mac OSX/IOS since MoltenVK doesn't support the dynamic shared memory allocation or dynamic grid size. * Clang format / tidy pass * Resolve conflicts for mini_webgpu.h ... 
revert to main * Use unique intrinsic var names for each kernel Cleanup constant value declarations with template helper methods Add comments on workgroup size usage * Wrap debug output under ifdef DEBUG_RUNTIME_INTERNAL macro guard Add nearest_multiple constraint to block/region allocator * Add vk_clear_device_buffer utility method Add nearest_multiple constraint to vulkan memory allocator + fixes correctness/multiple_outputs test Add vkCreateBuffer/vkDestroyBuffer debug output for gpu_object_lifetime_tracker Cleanup shutdown for shader_module destruction * Add note about nearest_multiple constraint for vulkan memory allocator * Hookup gpu_object_lifetime_tracker with Vulkan debug statements * Skip dynamic shared memory portion of test for Vulkan on iOS/OSX. * Fix stale comment for float type support. Fix incorrect lowering for intrinsic. --------- Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> Co-authored-by: Steven Johnson <srj@google.com> 25 April 2023, 00:21:15 UTC
fcddcf8 metal : replacing `arg_sizes` by `arg_types` in kernel run interface (#7505) * replacing arg_sizes by arg_types * build fix * allocating and computing arg_sizes[] on the stack * clang-format * zero termination oopsie! * special case when argument is a buffer * telling runtime to pass argument types instead of argument sizes to the kernel run call * args[i] could well be 0! * removing arg_sizes[] * addressing code review comments --------- Co-authored-by: Marcos Slomp <slomp@adobe.com> Co-authored-by: Steven Johnson <srj@google.com> 24 April 2023, 13:13:21 UTC
e55834b Fix Anderson2021 tests to avoid spurious failures on non-Cuda systems (#7518) * Fix Anderson2021 tests to avoid spurious failures on non-Cuda systems The Anderson2021 autoscheduler is pretty Cuda-specific, so some tests assume it is present; this is pretty much never true on macOS, and annoying spurious failures are annoying. This adds a new flag and capability to RunGenMain to try to sniff out the necessary runtime setup and make it a quiet [SKIP] failure when testing. * Use set instead of strstr() * Update LoopNest.cpp * Update RunGenMain.cpp * Update RunGenMain.cpp * Update RunGenMain.cpp * Update RunGenMain.cpp * trigger buildbots * Update RunGenMain.cpp 24 April 2023, 01:06:59 UTC
93a5887 Make stmt_html generation work correctly for submodules (#7522) * Don't erase stmt_html before resolving submodules * Fix stmt_html for submodules 20 April 2023, 17:58:33 UTC
294f80c Forbid assigning to Buffer(Expr) by introducing an intermediate type. (#7517) * Forbid assigning to Buffer(Expr) by introducing an intermediate type. Fixes #7514 * Simpler solution * Silence clang-tidy 19 April 2023, 23:26:50 UTC
8670a25 Fix for top-of-tree LLVM (#7523) 19 April 2023, 18:28:11 UTC
2527c35 Don't accidentally embed .s files in .a files when emitting stmt_html (#7520) * Don't accidentally embed .s files in .a files when emitting stmt_html Followup fix for #7516 * format 18 April 2023, 21:31:56 UTC
42e71f2 Convert stmt_html output to use stmt_viz output (#7516) * Allow emitting `stmt_viz` without specifying `assembly` TL;DR: if we request `stmt_viz` without `assembly`, just generate the latter to a temp file that we dispose of later; this wasn't feasible before since we were previously requiring the assembly output to be generated with the same directory and basename as stmt_viz, but that was fixed. * Convert stmt_html output to use stmt_viz output Per discussion on #7507, this entirely removes the "classic" stmt_html output and replaces it with the "new" StmtToViz output. Using `compile_to_lowered_stmt` or requesting `stmt_html` will now always output the new output, and requesting `stmt_viz` output is no longer legal. (Note that this builds on top of #7515, which must be submitted first.) It's not clear to me whether https://github.com/halide/Halide/issues/7507#issuecomment-1511761706 is a blocker for this change, or a request to add back already-lost functionality. * Update Makefile * Update Generator.cpp 18 April 2023, 16:28:26 UTC
8efc688 Allow emitting `stmt_viz` without specifying `assembly` (#7515) TL;DR: if we request `stmt_viz` without `assembly`, just generate the latter to a temp file that we dispose of later; this wasn't feasible before since we were previously requiring the assembly output to be generated with the same directory and basename as stmt_viz, but that was fixed. 17 April 2023, 22:22:44 UTC
c9c85dc Improve assembly-file finding logic in StmtToViz (#7513) (1) Avoid having to guess at location by just passing in the location, since we usually already know it. (2) If we don't know it, be more cautious when constructing it: the output html filename might not match our expectations, and all file extensions must use get_output_info() to work correctly on all platforms. 16 April 2023, 00:43:43 UTC
04f09d4 Add error message when casting multi-element Realization to Buffer (#7506) * Add error message when casting multi-element Realization to Buffer Fixes #7504 * Add missing test 14 April 2023, 16:58:23 UTC
e20d798 Add build number to Python wheel before uploading (#7500) * Add build number to Python wheel before uploading This change adds a build number based on GitHub Actions' `github.run_id` to the Python wheel before uploading. This should work around the issue that causes the uploads to fail currently. Fixes #7293 * fixup! Add build number to Python wheel before uploading 13 April 2023, 19:17:43 UTC
bea0075 Deprecate ParamMap (#7121) (#7357) * Deprecate ParamMap (#7121) This PR deprecates ParamMap for Halide 16, with the plan of removing it entirely for Halide 17; it was added to provide a threadsafe way to provide parameters to the JIT, but `compile_to_callable()` now does this in a much less intrusive way. * Updated comments, removed mutexes (mutices?) * formatting * Go back to HALIDE_ATTRIBUTE_DEPRECATED 13 April 2023, 17:55:43 UTC
e7f7860 d3d12: enforce weak linkage (#7496) * ensuring all symbols are weak, or static constexpr, to allow for merging runtimes together * clang-format fluke --------- Co-authored-by: Marcos Slomp <slomp@adobe.com> 12 April 2023, 16:13:50 UTC