https://github.com/halide/Halide

Revision Author Date Message Commit Date
a94e3ee Merge branch 'rootjalex/improve_cbounds_fixed' of github.com:halide/Halide into rootjalex/test_cbounds_fixed 17 May 2022, 19:46:50 UTC
6c1a930 performance improvements for new constant bounds methods 17 May 2022, 19:46:26 UTC
42ddfa1 Merge branch 'rootjalex/improve_cbounds_fixed' of github.com:halide/Halide into rootjalex/test_cbounds_fixed 16 May 2022, 18:29:12 UTC
75d2b7a clang format 16 May 2022, 18:28:55 UTC
649a3c1 Merge branch 'rootjalex/improve_cbounds_fixed' of github.com:halide/Halide into rootjalex/test_cbounds_fixed 16 May 2022, 17:20:26 UTC
853bcbd Merge branch 'main' of https://github.com/halide/Halide into rootjalex/improve_cbounds_fixed 16 May 2022, 17:12:33 UTC
09a986e Expand the x86 SIMD variants tested in correctness_vector_reductions (#6762) A recent bug in LLVM codegen was missed because it only affected x86 architectures with earlier-than-AVX2 SIMD enabled; it didn't show up for AVX2 or later. This revamps correctness_vector_reductions to re-run multiple times when multiple SIMD architectures are available on x86 systems. (correctness_vector_reductions was chosen here because it reliably demonstrated the specific failures in this case.) 13 May 2022, 20:56:47 UTC
4ab4ad9 Minor metadata-related cleanups (#6759) (Harvested from #6757, which probably won't land) - Add clarifying comment/reference in Generator - Add assertion to compile_to_multitarget() function - Fix misleading/wrong code in correctness_compile_to_multitarget 13 May 2022, 01:31:38 UTC
b38b661 Deprecate disable_llvm_loop_opt (#4113) (#6754) This PR proposes to (finally) deprecate disable_llvm_loop_opt: - make LLVM codegen default to no loop optimization; you must use enable_llvm_loop_opt explicitly to enable it - disable_llvm_loop_opt still exists, but does nothing (except issue a user_warning that the feature is deprecated) - Remove various uses of disable_llvm_loop_opt - Add comments everywhere that the default is different in Halide 15 and that the disable_llvm_loop_opt feature will be removed entirely in Halide 16 Note that all Halide code at Google has defaulted to having disable_llvm_loop_opt set for ~years now, so this is a well-tested codepath, and consensus on the Issue seemed to be that this was a good move. 10 May 2022, 20:58:26 UTC
a2e89d8 Add GeneratorFactoryProvider to generate_filter_main() (#6755) * Add GeneratorFactoryProvider to generate_filter_main() This provides hooks to allow overriding the Generator(s) that generate_filter_main() can use; normally it defaults to the global registry of C++ Generators, but this allows for (e.g.) alternate-language-bindings to selectively override this (e.g. to enumerate only Generators that are visible in that language, etc). (No visible change in behavior from this PR; this is just cherry-picked from work-in-progress elsewhere to simplify review & merge) * Update Generator.cpp * Fix error handling 10 May 2022, 01:07:57 UTC
a986078 Deprecate GeneratorContext getters with `get_` prefix (#6753) Minor hygiene: most getters in Halide don't have a `get_` prefix. These are very rarely used (only one instance in our test suite I could find) but, hey, cleanliness. 09 May 2022, 21:38:41 UTC
47d8103 Add a `HalideError` base class to Python bindings (#6750) * Add a `HalideError` base class to Python bindings Per suggestion from @alexreinking, this remaps all exceptions thrown by the Halide Python bindings to be `halide.HalideError` (or a subclass thereof), rather than plain old `RuntimeError`. * Remove scalpel left in patient * Don't use a subclass for PyStub error handling 06 May 2022, 00:49:40 UTC
6fbf203 Update hannk README link to hosted models page (#6749) The current one is being sunsetted 05 May 2022, 16:06:37 UTC
557690e Update WABT to 1.0.29 (#6748) 05 May 2022, 16:06:21 UTC
c8531a5 Silence "may be used uninitialized" in Buffer::for_each_element() (#6747) In at least one version of GCC (Debian 11.2.0-16+build1), an optimized build using `Buffer::for_each_element(int *pos)` will give (incorrect) compiler warnings/errors that "pos may be used uninitialized". From inspection of the code I feel pretty sure this is a false positive -- i.e., the optimizer is confused -- and since no other compiler we've encountered issues a similar warning (nor do we see actual misbehavior), I'm inclined not to worry -- but the warning does break some build configurations. Rather than try to fight with selectively disabling this warning, I'm going to propose inserting a memset() here to reassure the compiler that the memory really is initialized; while it's unnecessary, it's likely to be insignificant compared to the cost of usual calls to for_each_element(). (BTW, this is not a new issue, I've seen it for quite a while as this GCC is the default on one of my Linux machines... it just finally annoyed me enough to want to make it shut up.) 05 May 2022, 01:17:40 UTC
1606039 Revise PyStub calling convention for GeneratorParams (#6742) This is a rethink of https://github.com/halide/Halide/pull/6661, trying to make it saner in anticipation of the ongoing Python Generator work. TL;DR: instead of mixing GeneratorParams in with the rest of the keywords, segregate them into an optional `generator_params` keyword argument, which is a plain Python dict. This neatly solves a couple of problems: - synthetic params with funky names aren't a problem anymore. - error reporting is simpler because before an unknown keyword could have been intended to be a GP or an Input. - GP values are now clear and distinct from Inputs, which is IMHO a good thing. This is technically a breaking change, but I doubt anyone will notice; this is mainly here to get a sane convention in place for use with Python Generators as well. Also, a drive-by change to Func::output_types() to fix the assertion error message. 04 May 2022, 00:16:26 UTC
92dfb61 Add __pycache__ to toplevel .gitignore file (#6743) 02 May 2022, 18:14:18 UTC
f376cbb Silence "unscheduled update stage" warnings in msan_generator.cpp (#6740) 30 April 2022, 17:20:46 UTC
e6260a8 Add forwarding for the recently-added Func::output_type() method (#6741) 30 April 2022, 17:20:22 UTC
41b2d07 Fix regression from #6734 (#6739) That change inadvertently required the RHS of an update stage that used `+=` (or similar operators) to match the LHS type, which should not be required (implicit casting of the RHS is expected). Restructured to remove this, but still ensure that auto-injection of a pure definition matches the required types (if any), and updated tests. 28 April 2022, 20:53:19 UTC
fc0f4ed Add missing #include <functional> in ThreadPool.h (#6738) * Add missing #include <functional> in ThreadPool.h * Update ThreadPool.h 28 April 2022, 18:31:38 UTC
00f4b29 More typed-Func work (#6735) - Allow Func output_type(s)(), outputs(), dimensions(), and output_buffer(s)() to be called on undefined Funcs if the Func has required_type and required_dimensions specified. This allows for greater flexibility in defining pipelines in which you may want to set or examine constraints on a Func that hasn't been defined yet; previously this required restructuring code or other awkwardness. - Ensure that the Funcs that are defined for ImageParams and Generator fields define the types-and-dims when known. - Add some tests. 28 April 2022, 01:33:49 UTC
799c546 Augment Halide::Func to allow for constraining Type and Dimensionality (#6734) This enhances Func by allowing you to (optionally) constrain the type(s) of Exprs that the Func can contain, and/or the dimensionality of the Func. (Attempting to violate either of these will assert-fail.) There are a few goals here: - Enhanced code readability; in cases where a Func's values may not be obvious from the code flow, this can allow an in-code way of declaring it (rather than via comments) - Enhanced type enforcement; specifying constraints allows us to fail in type-mismatched compilations somewhat sooner, with somewhat better error messages. - Better symmetry for AOT/JIT code generation with ImageParam, in which the inputs (ImageParam) have a way to specify the required concrete type, but the outputs (Funcs) don't. If this is accepted, then subsequent changes will likely add uses where it makes sense (e.g., the Func associated with an ImageParam should always have both type and dimensionality specified since it will always be well-known). Note that this doesn't add any C++ template class for static declarations (e.g. `FuncT<float, 2>` -> `Func(Float(32), 2)`); these could be added later if desired. 27 April 2022, 20:48:27 UTC
86a4a59 Remove `rounding_halving_sub` and non-existent arm rhsub instructions (#6723) * remove arm (s | u)rhsub instructions * remove rounding_halving_sub intrinsic entirely 26 April 2022, 19:14:34 UTC
f5c77ce Deprecate variadic-template version of Realization ctor (#6695) * Deprecate variadic-template version of Realization ctor The variadic-template approach was useful before C++11 (!) added brace initialization, but preferring an explicit vector-of-Buffer is arguably better, and provides better symmetry with the Python bindings. Also, some drive-by tweaks to other Realization methods. * Update PyPipeline.cpp * trigger buildbots 25 April 2022, 17:00:10 UTC
85b9f29 Grab-bag of minor Python fixes (#6725) 21 April 2022, 20:02:00 UTC
754018b Add Func::output_type() method (#6724) * Add Func::output_type() method * Add Python 21 April 2022, 19:58:49 UTC
aa384af `get_amd_processor()`: implement detection for the rest of supported AMD CPUs (#6711) I have *not* personally tested that these are detected correctly. Cross-reference between: * https://github.com/llvm/llvm-project/blob/955cff803e081640e149fed0742f57ae1b84db7d/llvm/lib/Support/Host.cpp#L968-L1041 * https://github.com/llvm/llvm-project/blob/955cff803e081640e149fed0742f57ae1b84db7d/compiler-rt/lib/builtins/cpu_model.c#L520-L586 * https://github.com/gcc-mirror/gcc/blob/000c1b89d259fadb466e1f2e63c79da45fd17372/gcc/common/config/i386/cpuinfo.h#L111-L264 21 April 2022, 16:32:56 UTC
accc644 Remove legacy::FunctionPassManager usage in Codegen_PTX_Dev (#6722) LLVM devs indicate that none of the passes in this usage actually do anything and it can be safely removed. 21 April 2022, 03:01:15 UTC
3b3e89e Smarten type_of<> for fn ptrs; fix async_parallel for C backend (#6719) * Smarten type_of<> for fn ptrs; fix async_parallel for C backend (Fixes #2093) This basically just adds the right type annotations to make the parallel code produced by the C backend compile properly. This could have been fixed by inserting some brute-force void* casting into the C backend, but this felt a lot cleaner. The one thing here I'm a little unsure about is how I extended the Type code to be able to handle function-pointer types correctly; it works but doesn't feel very elegant. * Update Makefile * Update LowerParallelTasks.cpp * FunctionTypedef 20 April 2022, 17:01:39 UTC
a07d3e4 Closure functions for parallel tasks should be internal, not external (#6720) Minor optimization. 20 April 2022, 16:50:30 UTC
460c77e Update CodeGen_PTX_Dev to use new PassManager (#6718) * Update CodeGen_PTX_Dev to use new PassManager This was still using the LegacyPassManager for optimization, which will be going away at some point. (Code changes by @alinas; I'm just opening this PR on her behalf) * Fixes after review 20 April 2022, 00:25:35 UTC
65ba16e Combine string constants in combine_strings() (#6717) * Combine string constants in combine_strings() This is a pretty trivial optimization, but when printing (or enabling `debug`), it cuts the number of `halide_string_to_string()` calls we generate by ~half. * Update IROperator.cpp 19 April 2022, 21:53:00 UTC
01ca823 ARM vst mangling needs to be conditional on opaque ptrs (#6716) The fixes from last week regarding mangling of arm vst intrinsics need to be made conditional on whether the pointer is opaque or not; this will change based on whether `-D CLANG_ENABLE_OPAQUE_POINTERS=ON|OFF` is defined when LLVM is built, but should be sniffed via this API, according to my LLVM contact. 19 April 2022, 20:04:15 UTC
4df3c5d Remove the last remaining call to getPointerElementType() (#6715) * Remove the last remaining call to getPointerElementType() LLVM is moving to opaque pointers; we must have missed this one in previous work * ARM vst mangling needs to be conditional on opaque ptrs The fixes from last week regarding mangling of arm vst intrinsics need to be made conditional on whether the pointer is opaque or not; this will change based on whether `-D CLANG_ENABLE_OPAQUE_POINTERS=ON|OFF` is defined when LLVM is built, but should be sniffed via this API, according to my LLVM contact. * Revert "ARM vst mangling needs to be conditional on opaque ptrs" This reverts commit 9901314ff75dd0bf651b23d09c1d1f5f07d49ffd. 19 April 2022, 19:54:12 UTC
60a909f Fix type-mangling for vst on arm32 for LLVM15 (#6705) 14 April 2022, 16:49:38 UTC
77f7f5e Python: make Func implicitly convertible to Stage (#6702) (#6704) This allows for `compute_with` and `rfactor` to work more seamlessly in Python. Also: - Move two compute_with() variant bindings from PyFunc and PyStage to PyScheduleMethods, as they are identical between the two - drive-by removal of redundant `py::implicitly_convertible<ImageParam, Func>();` call 13 April 2022, 21:31:17 UTC
87c0cc9 llvm no longer wants a type suffix on vst intrinsics (#6701) * llvm no longer wants a type suffix on vst intrinsics * Fix silly mistake * Change 64-bit only Co-authored-by: Andrew Adams <anadams@adobe.com> 12 April 2022, 23:38:56 UTC
3d7b977 Drop support for Matlab extensions (#6696) * Drop support for Matlab extensions Anecdotally, this hasn't been used in ~years, and the original author (@dsharletg) had suggested dropping it a while back. I'm going to propose we go ahead and drop it for Halide 15 and see who complains. * Fixes for top-of-tree LLVM * Update force_include_types.cpp * trigger buildbots * Update CodeGen_LLVM.cpp 12 April 2022, 23:33:02 UTC
4da8932 Remove deprecated JIT handler setters (#6699) 12 April 2022, 04:45:12 UTC
009d86f Remove deprecated versions of Func::prefetch() (#6698) 12 April 2022, 04:45:00 UTC
3944fb0 Faster `widening_mul(int16x, int16x) -> int32x` for x86 (AVX2 and SSE2) (#6677) * add widening_mul using vpmaddwd for AVX2 * add vpmaddwd/pmaddwd test * add widening_mul with pmaddwd for SSE2 12 April 2022, 02:41:52 UTC
08325a4 Fixes for top-of-tree LLVM (#6697) 11 April 2022, 23:36:16 UTC
f906eba Silence "unknown warning" in Clang 13 (#6693) Clang 13 removed the `return-std-move-in-c++11` warning entirely, so specifying it now warns that the warning is unknown. 11 April 2022, 16:47:39 UTC
d568469 Add `break` to avoid 'possible unintentional fallthru' warning (#6694) 11 April 2022, 16:40:27 UTC
54f3977 Always mark _ucon as 'unused' in Codegen_C (#6691) * Always mark _ucon as 'unused' in Codegen_C, even if asserts are enabled, since generated closure functions may not use it * halide_unused -> halide_maybe_unused * fix test_internal * More halide_unused -> halide_maybe_unused 08 April 2022, 23:25:34 UTC
887d340 Upgrade to clang-format 13 (#6689) Goal here: eliminate the need for a local version of llvm/clang-12, and don't stay too far behind the toolchain. As always, clang-format doesn't promise backwards compatibility, but the main differences in formatting are: - more regularization of spaces at the start of comments (I like this change) - minor difference of formatting of function-pointer-type declarations (not a fan of this, but I can't find a way to disable it and it's only really used in a handful of places in the Python bindings) 08 April 2022, 19:01:27 UTC
b5840f7 Drop support for LLVM12 (#6686) * Drop support for LLVM12 Halide 15 only needs to support LLVM13 and later. Drop all the special-casing for LLVM12. * Update packaging.yml * Update presubmit.yml * 13 * more * Update presubmit.yml * woo * Update presubmit.yml * Update run-clang-tidy.sh * Update run-clang-tidy.sh * Update .clang-tidy * Update .clang-tidy * wer * Update Random.cpp * wer * sdf * sdf * Update packaging.yml 08 April 2022, 00:59:14 UTC
e549be7 Remove deprecated `build()` support from Generators (#6684) This was deprecated in Halide 14; let's remove it entirely for Halide 15. 08 April 2022, 00:03:34 UTC
f64bd08 Remove deprecated `Halide::Output` type (#6685) It was deprecated (in favor of `OutputFileType`) in Halide 14; let's remove it entirely for Halide 15. 07 April 2022, 23:59:49 UTC
fe96aaa Fix "set but not used" warnings/errors (#6683) * Fix "set but not used" warnings/errors Apparently XCode 13.3 has smarter warnings about unused code and emits warnings/errors for these, so let's clean them up. * Also fix missing `ssize_t` usage 07 April 2022, 21:30:08 UTC
1d1b556 Halide::Tools::save_image() should accept buffers with `const` types (#6679) 06 April 2022, 23:25:04 UTC
ad0408e Clean up Python extensions in python_bindings (#6670) * Remove the nobuild/partialbuildmethod tests from python_bindings/ They no longer serve a purpose and are redundant to other tests. * WIP * Update pystub.py * wip * wip * wip * Update TargetExportScript.cmake * Update PythonExtensionHelpers.cmake * PyExtensionGen didn't handle zero-dimensional buffers 06 April 2022, 17:52:14 UTC
12270a5 Bump development Halide version to 15.0.0 (#6678) * Bump development Halide version to 15.0.0 * trigger buildbots 06 April 2022, 16:57:39 UTC
72ad2e6 `-mtune=native` CPU autodetection for AMD Zen 3 CPU (#6648) * `-mtune=native` CPU autodetection for AMD Zen 3 CPU * Address review notes. * Fix MSVC build * Address review notes 06 April 2022, 16:13:45 UTC
9866df2 Fix ctors for Realization (#6675) For vector-of-Buffers, the ctor took a non-const ref to the argument, which was weird and nonsensical. Replaced with a const-ref version and an rvalue-ref version; it turns out that literally *all* of the internal calls were able to use the latter, trivially saving some copies. 05 April 2022, 23:23:25 UTC
fdd6500 Future-proof `processor` to `tune processor` (#6673) 05 April 2022, 20:51:31 UTC
43af5b6 Allow PyPipeline and PyFunc to realize() scalar buffers (#6674) 05 April 2022, 20:38:51 UTC
f56614e Fix GPU depredication/scalarization (#6669) * Scalarize predicated Loads * Cleanup * Fix gpu_vectorize scalarization for D3D12 * Fix OpenCL scalarization * Minor fixes * Formatting * Address review comments * Move Shuffle impl to CodeGen_GPU_C class * Extra space removal Co-authored-by: Shoaib Kamil <kamil@adobe.com> 01 April 2022, 14:27:50 UTC
40f895d `-mtune=`/`-mcpu=` support for x86 AMD CPUs (#6655) * `-mtune=`/`-mcpu=` support for x86 AMD CPUs * Move processor tune into its own enum, out of features * clang-format * Target: make Processor more optional * Processor: add explanatory comments which CPU is what * Drop outdated changes * Make comments in Processor more readable / fix BtVer2 comment * Target: don't require passing Processor * Make processor more optional in the features string serialization/verification * Address review notes * Undo introduction of halide_target_processor_t * Fix year for btver2/jaguar 31 March 2022, 22:23:26 UTC
6b9ed2a Remove the nobuild/partialbuildmethod tests from python_bindings/ (#6668) They no longer serve a purpose and are redundant to other tests. 30 March 2022, 22:30:42 UTC
5d2abd3 Add ldscript code for Python extensions in CMake (#6665) * Add ldscript code for Python extensions in CMake We added ldscripts to the Makefile for Python extensions (to restrict exported symbols to just the PyInit_foo symbol), but neglected to do so for CMake. This corrects that. 29 March 2022, 23:33:42 UTC
8aba364 Allow `make test_apps` to work with ASAN (#6659) * Allow `make test_apps` to work with ASAN With asan or tsan in the target, there is a space in the OPTIMIZE var, so it needs to be quoted. * tickle buildbots * tickle buildbots * tickle buildbots 29 March 2022, 18:56:33 UTC
ed3f4a7 Add optional runtime H::R::Buffer access checks (#6660) * Add optional runtime H::R::Buffer access checks This adds some optional `assert()` checks to HRB's `operator()` and friends. They are only enabled if `HALIDE_RUNTIME_BUFFER_CHECK_INDICES=1` is defined at compile time. Also fixes errors found by enabling these assertions and running tests. * Update fast_pow.cpp * clang-format * tickle buildbots * tickle buildbots 29 March 2022, 00:10:49 UTC
c2bebe2 Python Bindings: fix Python `bool` -> `Expr` implicit conversion (#6657) * Python Bindings: fix Python `bool` -> `Expr` implicit conversion It was implicitly converting to a Halide `Int(32)` literal of value 1, but we want it to match Halide's boolean type of `UInt(1)` * Update basics.py * wip * tickle buildbots 29 March 2022, 00:10:37 UTC
14d89f3 Fix `variable set but not used` warning/error (#6658) Yes, some compilers complain about this. 25 March 2022, 23:38:46 UTC
17b537c Timer based profiler (#6642) * Add support for timer interrupt based profiling, which is useful for bring up on embedded ("bare metal") systems that may not have a full OS with threads. * Update runtime_api file with new routines. * Turn locking back on in timer based profiling case as it can be used in multiprocessor situations. (Both as an option on systems where threads would be fine and on embedded systems which don't have time shared threads but cores are dedicated to the Halide threadpool.) * Add target flag for timer profiling and extend performance_profiler test to cover timer profiling. 22 March 2022, 19:04:53 UTC
650554a Eliminate some unnecessary clamping in ClampUnsafeAccesses (#6297) (#6654) * Eliminate some unnecessary clamping in ClampUnsafeAccesses (#6297) * Update ClampUnsafeAccesses.cpp * Update ClampUnsafeAccesses.cpp * Update ClampUnsafeAccesses.cpp 22 March 2022, 00:31:24 UTC
49db215 Fix apparent type in PR #6294 (#6653) 18 March 2022, 19:36:17 UTC
9ab3566 [CMake] Deduplicate `Halide_LLVM_VERSION` and `LLVM_PACKAGE_VERSION` (#6646) 16 March 2022, 04:56:40 UTC
b608583 Update initialization of WABT `store` field to work with top-of-tree (#6649) The copy and move assignment operators for Store are going away; initialize ours in the 15 March 2022, 22:07:46 UTC
07dddb7 Allow profiler feature under wasm iff wasm_threads is enabled (#6643) The profiler requires threads, but works fine when wasm_threads are enabled. 10 March 2022, 18:57:29 UTC
f6628aa Fix UB in hannk FillWithRandom operation. (#6645) A recent change to libc++ rejects std::uniform_int_distribution<uint8_t> explicitly (https://reviews.llvm.org/D114920). That libc++ change prevents this file from building with upcoming revisions of the C++ toolchain. 10 March 2022, 18:50:25 UTC
105f7e5 Fix const-correctness in C/C++ backend (Issue #6636) (#6638) in the Load handler, we need to emit the cast in the form `TYPE const *` rather than `const TYPE *`, as TYPE could be `void *`, and the const would bind in a way we don't want. 07 March 2022, 20:59:04 UTC
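The binding problem described in this commit can be checked with standard type traits alone. The sketch below is not taken from the Halide C backend; it spells out the two textual forms by hand for the case where TYPE is `void *`, and the alias names `PrefixConst` and `SuffixConst` are made up for illustration.

```cpp
#include <type_traits>

// `const TYPE *` with TYPE textually replaced by `void *`.
// The const attaches to `void`, so the elements are mutable pointers.
using PrefixConst = const void **;

// `TYPE const *` with TYPE textually replaced by `void *`.
// The const attaches to the element itself, which is what a read-only load wants.
using SuffixConst = void *const *;

// Element type seen through each pointer.
using PrefixElem = std::remove_pointer_t<PrefixConst>;  // const void *
using SuffixElem = std::remove_pointer_t<SuffixConst>;  // void *const

static_assert(std::is_same<PrefixElem, const void *>::value,
              "prefix form yields pointer-to-const-void elements");
static_assert(!std::is_const<PrefixElem>::value,
              "prefix form leaves the elements themselves writable");
static_assert(std::is_same<SuffixElem, void *const>::value,
              "suffix form yields const pointer elements");
static_assert(std::is_const<SuffixElem>::value,
              "suffix form makes the elements read-only");
```

Everything here is evaluated at compile time, so a successful compile is the whole check. For non-pointer TYPEs such as `int` the two spellings name the same type, which is why only pointer-valued TYPEs expose the difference.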
5f37d50 Convert most remaining Generators to prefer statically-dimensioned Inputs and Output where possible (#6641) This is the same as #6620, except that it omits autoschedulers/adams2019/cost_model_generator.cpp (which is unusually complex and not yet settled as to whether the changes are welcome). Basically an attempt to land the uncontroversial parts. 07 March 2022, 20:58:12 UTC
a55ae55 Clear bounds info on casts when value bounds are undefined (for overflow types) (#6640) 06 March 2022, 20:21:32 UTC
979e204 python_bindings: acquire GIL before printing (#6635) 06 March 2022, 00:25:10 UTC
0b8e263 Clean up python_binding Makefile (#6634) * Clean up python_binding Makefile Some of the tests with Generator dependencies had deps set up in a weird way that made some downstream work painful. Cleaned up. Also vastly reduced build-time noise. * Update Makefile * Add linker scripts for Python extensions 06 March 2022, 00:24:29 UTC
3827279 Python Bindings didn't allow for zero-D Funcs, ImageParams, Buffers (#6633) * Python Bindings didn't allow for zero-D Funcs, ImageParams, Buffers There were no overloads or tests for accessing the element of any of these in the zero-D case, and the obvious syntax (`[]`, to mirror C++ `()` in these cases) isn't legal in Python. To support this uncommon-but-necessary case, I'm proposing that we use the syntax `[None]`, which isn't pretty, but is less bad than other options I've considered so far. (Suggestions welcome.) * Use [()] instead of [None] 04 March 2022, 00:49:18 UTC
86728d7 Wild match object is not foldable (#6623) 26 February 2022, 18:56:09 UTC
caa3ad0 Avoid double narrowing in widening_add/widening_sub if type is 8-bit (#6629) 24 February 2022, 01:09:55 UTC
b2edbc9 Disallow `Type::narrow()` and `Type::widen()` from producing bitwidths between 1 and 8 bits (#6622) * Disallow Type::narrow() and Type::widen() from producing bitwidths between 1 and 8 bits * Narrowing a 1-bit type should error 22 February 2022, 20:09:23 UTC
a4ed033 Make IRComparer consider nans to be less than non-nans. (#6626) * Make IRComparer consider nans to be less than non-nans. Fixes #6624 * Fix test * Respond to reviewer comments. * Return -1 instead of +1 22 February 2022, 04:31:55 UTC
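As a rough illustration of the ordering rule named in this commit, the sketch below shows a three-way comparison on `double` in which a NaN sorts before every non-NaN value. It is not the IRComparer implementation; `is_nan` and `compare_doubles` are hypothetical names, and treating two NaNs as equal is an assumption made here so that the ordering stays consistent.

```cpp
#include <limits>

// NaN is the only value that compares unequal to itself.
constexpr bool is_nan(double x) { return x != x; }

// Hypothetical three-way comparison: returns -1, 0 or +1.
// NaN is ordered before all non-NaN values; two NaNs compare equal (assumption).
constexpr int compare_doubles(double a, double b) {
    if (is_nan(a) || is_nan(b)) {
        if (is_nan(a) && is_nan(b)) {
            return 0;
        }
        return is_nan(a) ? -1 : 1;
    }
    if (a < b) {
        return -1;
    }
    return (a > b) ? 1 : 0;
}

constexpr double kNaN = std::numeric_limits<double>::quiet_NaN();
constexpr double kNegInf = -std::numeric_limits<double>::infinity();

static_assert(compare_doubles(kNaN, kNegInf) == -1, "NaN sorts below even -infinity");
static_assert(compare_doubles(1.0, kNaN) == 1, "non-NaN sorts above NaN");
static_assert(compare_doubles(kNaN, kNaN) == 0, "NaNs compare equal to each other");
```

Without a special case like this, both `a < b` and `a > b` are false whenever a NaN is involved, so the comparison would report NaN as equal to every other value.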
16bfa2f remove incorrect docs on widening_add (#6625) 21 February 2022, 18:42:15 UTC
be1269b Add Stage::unscheduled() 18 February 2022, 22:26:15 UTC
0786dd4 Fix atomics test 18 February 2022, 22:26:15 UTC
6424aaa Reenable warning about unscheduled update definitions and fix associated issues in the tests and apps. This is an old warning that stopped triggering because it wasn't tested. We should either remove it, fix the trigger conditions, or perhaps make it an error. This PR fixes the trigger conditions and fixes all instances of the warning in our tests and apps. The warning triggers if you schedule some but not all of the update definitions of a Func. It's to protect against the common error of only scheduling the pure definition of something like a summation. The warning can be suppressed by inserting a call to func.update(idx). 18 February 2022, 22:26:15 UTC
7373eb9 Move GeneratorContext into a standalone class (#6618) * Move GeneratorContext into a standalone class * Minor Fixes * clang-tidy * Update Generator.cpp * Update Generator.cpp 17 February 2022, 17:07:46 UTC
846592f Update WABT version to the just-released 1.0.27 (instead of main) (#6619) * Update WABT version to the just-released 1.0.27 (instead of main) * tickle buildbots 16 February 2022, 22:41:18 UTC
4ccd0ec Remove halide_config.cmake from Makefile build. Fixes #6615 (#6616) 16 February 2022, 01:25:09 UTC
f9189dc Update apps/hannk to use TFLite 2.8.0 (#6617) 16 February 2022, 01:24:57 UTC
3628b67 Minor Generator cleanup (#6613) 15 February 2022, 00:30:25 UTC
ba86c2e Unbreak WABT again by using main instead of a commit (#6614) 15 February 2022, 00:29:13 UTC
38032e8 Only commutative reductions can be parallelized (#6609) Because parallelization changes the order of computation within the reduction, parallelizing associative but non-commutative reductions can result in (non-deterministically) incorrect results in the same way `reorder`ing them can. For instance Halide currently accepts the following code, but generates non-deterministic outputs on GPU. On CPU with `.parallel(r.x)`, OpenMP rejects the generated code (correctly) stating that the `#pragma omp atomic` is invalid for the same reasons.
```c++
#include <stdio.h>
#include "Halide.h"
using namespace Halide;

int main(int argc, char **argv) {
    Halide::Func A("A"), B("B");
    Halide::Var i("i");
    A(i) = i;
    B() = -1;
    Halide::RDom r(0, 1024);
    B() = A(r.x);
    A.compute_root();
    B.update().atomic().gpu_blocks(r.x);
    B.compile_jit(get_host_target().with_feature(Target::CUDA));
    Halide::Buffer<int32_t> b = B.realize();
    printf("%d\n", b());
    return 0;
}
```
14 February 2022, 18:43:50 UTC
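The reasoning in this commit can be reproduced without Halide by modeling the update `B() = A(r.x)` as a fold with a "keep the newest value" combiner. That combiner is associative but not commutative, so the result depends on visiting order. The sketch below is standard C++ only; `keep_last`, `sum_op`, `fold_in_order` and the sample arrays are hypothetical names chosen for illustration, and the shuffled order merely stands in for whatever order a parallel schedule might produce.

```cpp
#include <array>
#include <cstddef>

// Combiner behind an update like B() = A(r.x): the new value replaces the old.
// Associative (the last operand always wins) but not commutative.
constexpr int keep_last(int /*acc*/, int x) { return x; }

// Addition is both associative and commutative, for contrast.
constexpr int sum_op(int acc, int x) { return acc + x; }

// Fold `values` with `op`, visiting the elements in the given index order.
template <typename Op, std::size_t N>
constexpr int fold_in_order(Op op, int init,
                            const std::array<int, N> &values,
                            const std::array<std::size_t, N> &order) {
    int acc = init;
    for (std::size_t i = 0; i < N; ++i) {
        acc = op(acc, values[order[i]]);
    }
    return acc;
}

constexpr std::array<int, 4> kValues{{10, 11, 12, 13}};
constexpr std::array<std::size_t, 4> kSerial{{0, 1, 2, 3}};
constexpr std::array<std::size_t, 4> kShuffled{{2, 3, 1, 0}};

static_assert(fold_in_order(keep_last, -1, kValues, kSerial) == 13,
              "serial order ends on the final element");
static_assert(fold_in_order(keep_last, -1, kValues, kShuffled) == 10,
              "a different order ends on a different element");
static_assert(fold_in_order(sum_op, 0, kValues, kSerial) ==
                  fold_in_order(sum_op, 0, kValues, kShuffled),
              "a commutative combiner is insensitive to order");
```

The checks run at compile time. Only the commutative combiner gives the same answer for both orders, which matches the restriction the commit title states.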
39ed0c8 Merge branch 'rootjalex/improve_cbounds_fixed' of https://github.com/halide/Halide into rootjalex/test_cbounds_fixed 14 February 2022, 18:30:07 UTC
ac42774 fix stupid bug in find_constant_bound 14 February 2022, 18:29:37 UTC
4c356a3 Merge branch 'rootjalex/improve_cbounds_fixed' of https://github.com/halide/Halide into rootjalex/test_cbounds_fixed 14 February 2022, 16:49:20 UTC
1accd0a Merge branch 'master' of github.com:halide/Halide into rootjalex/improve_cbounds_fixed 14 February 2022, 16:48:46 UTC
ff75fff optimize for the singular direction bounds case 14 February 2022, 16:45:55 UTC
3a19053 only simplify if an approximate method changed the Expr 14 February 2022, 16:38:23 UTC