https://github.com/halide/Halide

Revision Message Commit Date
36e3c5f Disable TARGET_SPIRV by default for now 29 July 2022, 18:19:31 UTC
85ebe2e Fix formatting for more single-line if statements 28 July 2022, 22:05:53 UTC
e733270 Merge branch 'main' into vulkan-phase1-spirv 28 July 2022, 22:01:25 UTC
b9cec32 Clang format/tidy pass 28 July 2022, 22:00:51 UTC
a22e905 Rename hash_* methods to make_*_key methods (since they construct a key and don't actually hash the value) Fix typo on components 28 July 2022, 21:57:48 UTC
9e1296b Add comment to SpirvIR.h header clarifying this file should not be exported. Fix formatting to avoid single line if statements. Use reserve for constructing vector components 28 July 2022, 21:49:08 UTC
146b7ac Refactor is_defined() asserts into check_defined() for reuse 28 July 2022, 21:23:47 UTC
e1b4947 Add comment about *not* including internally used headers like SpirvIR.h 28 July 2022, 21:23:15 UTC
b9a3356 Remove (most) of the env var usage from Adams2019 (#6861) * Move ASLog.cpp/.h to common/ * Add trivial Parsing utility & use it * Update ParamParser.h * fixes * wip * fixes * Fixes * clang-format * Update Makefile * Remove may_subtile * Update Cache.cpp * Update Cache.cpp * Update AutoSchedule.cpp * Update AutoSchedule.cpp 27 July 2022, 23:32:14 UTC
3859b36 Add support for generating x86 sum-of-absolute-difference reductions (#6872) 27 July 2022, 22:12:36 UTC
4ea273b Merge branch 'main' into vulkan-phase1-spirv 27 July 2022, 21:17:29 UTC
f31a72c Add ./dependencies/spirv to clang format ignore file 27 July 2022, 21:08:58 UTC
e35ca5c Remove SpirvIR.h header file from being included with Halide.h (since it's only used internally for CodeGen) 27 July 2022, 21:04:00 UTC
ec7f27c Make SPIR-V include path a system path to avoid clang format/tidy processing 27 July 2022, 21:00:17 UTC
ae01eb1 Add local copy of SPIR-V header file, along with license and readme. Update CMake rules to use local include path by default. 27 July 2022, 20:55:12 UTC
c8b811a Fixes to allow compiling with LLVM16 (#6889) 27 July 2022, 20:20:16 UTC
e3e169d Rewrite PythonExtensionGen to be C++ based (#6888) * Rewrite PythonExtensionGen to be C++ based This is intended as an alternative to #6885 -- this is even *more* gratuitous, but: - We have ~always compiled Python extensions using C++ anyway - This code is arguably terser, cleaner, and safer (the cleanups happen via dtors) - The code size difference is negligible (~300 bytes out of 160k for addconstant.cpython-39-darwin.so) * Update PythonExtensionGen.cpp 27 July 2022, 16:21:44 UTC
2794bd1 Don't use imported interface for SPIR-V. Use Halide_SPIRV naming since target is defined before Halide itself. 26 July 2022, 23:58:00 UTC
842dbd7 Revert back to Halide_SPIRV target name 26 July 2022, 23:33:28 UTC
9d42283 Fix path finding logic for SPIR-V header path from populated fetch dependency 26 July 2022, 23:08:35 UTC
49f065a Turn on FETCH_SPIRV_HEADERS by default to get build to pass for now 26 July 2022, 22:57:26 UTC
93e6df2 Fix declaration ordering for TARGET_SPIRV option so that dependencies get triggered 26 July 2022, 22:43:35 UTC
c4a1602 Add missing iostream header when WITH_SPIRV is undefined 26 July 2022, 22:20:00 UTC
a4fd2af Merge branch 'main' into vulkan-phase1-spirv 26 July 2022, 22:07:40 UTC
6490b3e Update src/CMakeLists.txt Co-authored-by: Alex Reinking <reinking@google.com> 26 July 2022, 21:52:22 UTC
b877573 Fixes and cleanups to address PR #6882 Refactor logic of SPIR-V dependency to make fetch dependency optional Change SPIR-V fetch dependency to avoid building and just populate contents Change SPIR-V internal test to always link against method ... only enabled if WITH_SPIRV is defined Add missing SPIRV target feature 26 July 2022, 21:51:05 UTC
7821212 Add set-host-dirty/copy-to-host to PythonExtensionGen (#6869) * Add set-host-dirty/copy-to-host to PythonExtensionGen See https://github.com/halide/Halide/issues/6868: Python Buffers are host-memory-only, so if the AOT-compiled halide code runs on (say) GPU, it may fail to copy the inputs to device and/or the results back to host. This fixes that. (We still need a solution that allows for lazy copies, but that will require adding another protocol that supports it.) * Update PythonExtensionGen.cpp 25 July 2022, 16:32:24 UTC
5e69ad9 [Codegen_LLVM] Define all the things (#6866) Long-term plan for LLVM is to get rid of `undef`, and replace it with zero-initialization, err, `poison`, because it has nicer semantics. Everywhere we use `undef` as a placeholder in shuffle (be it either for a second operand, or undef shuffle mask element), or as a base 'empty' vector we are about to fully override via insertelement, we can just switch those to poison nowadays. The scary part is the `Call::undef` semantics/lowering, perhaps it will need to be `freeze poison`. 25 July 2022, 16:24:56 UTC
11a049c #6863 - Fixes to make address sanitizer happy for internal runtime classes (#6880) * Fixes to make address sanitizer happy. Fixed initialization defects in StringStorage that could cause buffer overruns Fixed memory leaks within RegionAllocator and BlockAllocator Added system memory allocation tracking to all internal runtime tests. * Clang Tidy / Format pass * Fix formatting to use braces around if statements Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> 22 July 2022, 22:20:47 UTC
7e06694 Hookup internal SPIRV IR test 22 July 2022, 22:07:42 UTC
4770495 Ensure $CMAKE_{lang}_OUTPUT_EXTENSION is set before using it (#6879) Ensure CMAKE_{lang}_OUTPUT_EXTENSION is set before using it Co-authored-by: Shoaib Kamil <kamil@adobe.com> 22 July 2022, 14:36:50 UTC
c904c53 Refactor/cleanup in Autoscheduler code (#6858) * Move ASLog.cpp/.h to common/ * Add trivial Parsing utility & use it * Update ParamParser.h * fixes * fixes 21 July 2022, 21:24:51 UTC
06fcf94 Fix error in Makefile for Adams2019 on OSX (#6877) We erroneously link in the dylib and also dynamically load it, causing an error. We should skip the linkage and always load dynamically. 21 July 2022, 19:14:39 UTC
04c465b [Codegen] Fail to codegen `Call::undef`, just like `Call::signed_integer_overflow` (#6871) See discussion in https://github.com/halide/Halide/pull/6866. It's not obvious if that codepath is ever hit, let's optimistically assume that it is not. If this turns out to be not true, we'll have to deal with a more complicated question of the proper lowering for it, can it be `poison`, or must it be a `freeze poison`. 21 July 2022, 19:11:22 UTC
8b5486b [Codegen_LLVM] Radically simplify `visit(const Reinterpret *op)` (#6865) 1. LLVM IR `bitcast` happily bitcasts between vectors and scalars: https://godbolt.org/z/9zqx11rna 2. `ptrtoint` already implicitly truncates/zero-extends if the int is larger than the pointer type: https://llvm.org/docs/LangRef.html#ptrtoint-to-instruction 3. `inttoptr` already implicitly truncates/zero-extends if the int is larger than the pointer type: https://llvm.org/docs/LangRef.html#inttoptr-to-instruction So we don't need to do any of that 'special' handling. 21 July 2022, 16:35:40 UTC
9a94756 Use pmaddubsw 8-bit horizontal widening adds (Fixes #6859) (#6873) * use pmaddubsw 8-bit horizontal widening adds * add SSE3 versions too * add pmaddubsw tests 21 July 2022, 15:01:16 UTC
967c3bf Fix simd_op_check for top-of-tree LLVM (#6874) * Fix simd_op_check for top-of-tree LLVM * clang-format 20 July 2022, 23:57:27 UTC
51c06b7 Python source reorg (#6867) * Move python binding sources to src/halide/halide_ * Rename native module to halide_ * Fix tests * Avoid copying Python sources * Fix installation rules * Make diff smaller * trigger buildbots * Add issue todo Co-authored-by: Steven Johnson <srj@google.com> 20 July 2022, 22:35:45 UTC
359026a Promote Reinterpret Intrinsic into a Reinterpret IR Node (#6853) * Promote Reinterpret Intrinsic into a Reinterpret IR Node As discussed in https://github.com/halide/Halide/issues/6801#issuecomment-1152731683 I don't think this is complete, there are likely a few more places that need to be taught about it still, although I think this is mostly it. Note that this only promotes the intrinsic, this does not adjust its handling, as hinted in: https://github.com/halide/Halide/issues/6801#issuecomment-1155603752 * Silence buildbot warning * Speculative fix for Codegen C failure? * Restore comment * Delete obsolete FIXME * RegionCost: reinterpret is free * LICM: actually adjust the comment 20 July 2022, 00:19:44 UTC
2d907c4 [vulkan phase0] Add adts for containers and memory allocation to runtime (#6829) * Cherry pick runtime internals as standalone commit (preparation work for Vulkan runtime) * Clang format/tidy fixes * Fix runtime test linkage and include paths to not include libHalide * Update test/runtime/CMakeLists.txt Fix typo mismatch for HALIDE_VERSION_PATCH Co-authored-by: Alex Reinking <reinking@google.com> * Add compiler id guard to build options for runtime tests * Avoid building runtime tests on MSVC since Halide runtime headers are not MS compatible Remove CLANG warning flag for runtime test * Change runtime test compile definitions to be PRIVATE. Remove PUBLIC_EXPORTS from runtime test definition. * Add comment about GNU warnings for 'no-builtin-declaration-mismatch' * Change to debug(user_context) for debug messages where context is valid. Wrap verbose debugging with DEBUG_RUNTIME ifdef. Style pass based on review comments. * Add note explaining why we disable the internal runtime tests on MSVC. * Cleanup cmake logic for disabling runtime internal tests for MSVC and add a status message. * Don't use strncpy for prepend since some implementations may insert a null char regardless of the length used * Workaround varying platform str implementations and handle termination directly. * Clang Tidy/Format pass Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com> Co-authored-by: Alex Reinking <reinking@google.com> 15 July 2022, 22:15:18 UTC
b1ca334 Rework autoscheduler API (#6788) (#6838) * Rework autoscheduler API (#6788) * Oops * Update test_function_dag.cpp * clang-tidy * trigger buildbots * Update Generator.h * Minor cleanups * Update README_cmake.md * Check for malformed autoscheduler_params dicts * Add alias-with-autoscheduler code, plus tweaks * Update stubtest_jittest.cpp * Update Makefile * trigger buildbots * fixes * Update AbstractGenerator.cpp * Update stubtest_generator.cpp * Update Makefile * Add deprecation warning for HALIDE_ALLOW_LEGACY_AUTOSCHEDULER_API * Make AutoschedulerParams a real struct * clang-tidy 15 July 2022, 22:13:50 UTC
fad2f73 Refactor SPIR-V factory methods. Fix SPIR-V interface library and header paths. Add SPIR-V internal test. 15 July 2022, 22:02:33 UTC
24913eb Silence Adams2019 Autoscheduler (#6854) * Make aslog() a proper ostream * Ensure that all `dump()` calls take and use an ostream * Progress Bar only draws as LogLevel >= 1 * clang-format * Rework all aslog(0) statements * Update ASLog.cpp * syntax * Update ASLog.cpp * Revert fancy aslog stuff * Update ASLog.h * trigger buildbots 15 July 2022, 17:55:43 UTC
f9c2cdf Add autoscheduling to the generator_aot_stubuser test (#6855) * Add autoscheduling to the generator_aot_stubuser test * fix test_apps * fix test_apps, again 14 July 2022, 23:26:53 UTC
0db87dd Refactor SPIR-V IR into separate header / source files. 14 July 2022, 21:41:00 UTC
e414f19 Import SPIRV-IR from personal branch 14 July 2022, 19:22:13 UTC
bdd7114 Fix the PLUGINS argument to properly join multiple arguments (#6851) 13 July 2022, 17:13:11 UTC
13a43c0 Add placeholder code for bfloat16 in Python (#6849) (#6850) * Add placeholder code for bfloat16 in Python (#6849) This is a no-op change; I just want to mark the place(s) in the Python bindings that need attention if/when it becomes possible to support bfloat16 in Python buffers. * Update PyBinaryOperators.h 12 July 2022, 23:08:52 UTC
708a320 Deprecate/remove Generator::get_externs_map() and friends (#6844) * Deprecate/remove Generator::get_externs_map() and friends This is a feature of Generator that was added years ago to allow adding external code libraries in LLVM bitcode form (rather than simply as extern "C" or similar). In theory it allow for better codegen for external code modules (since LLVM has access to all the bitcode for optimization); in practice, we only know of one project that ever used it, and that project no longer exists. Additionally, it tended to be fairly flaky in terms of actual use -- e.g., missing symbols tended to crop up unpredictably. The issues with this feature are likely fixable, but since it hasn't (AFAICT) been used in ~years, we're better off deprecating it for Halide 15 and removing for Halide 16. (If anyone out there is still relying on this feature, obviously you should speak up ASAP.) * Also remove ExternalCode.h & friends * Also remove correctness/external_code.cpp * HALIDE_ALLOW_GENERATOR_EXTERNS_MAP -> HALIDE_ALLOW_GENERATOR_EXTERNAL_CODE 11 July 2022, 23:27:49 UTC
d266e4e Remove Generator::value_tracker and friends (#6845) This is an internal-to-Generator helper that is used to try to detect certain classes of errors when using GeneratorStubs. To the best of my knowledge, it has ~never found a useful error in all of its existence; combined with the very limited usage of GeneratorStubs, I think this code no longer pays for itself, and should be removed. (Note that this was never externally visible, thus no deprecation warnings should be necessary.) 11 July 2022, 22:07:54 UTC
8159dd3 Check RDom::where predicates for race conditions (#6842) Fixes #6808 11 July 2022, 18:39:14 UTC
29ebde9 Better lowering of halving_sub and rounding_halving_add (#6827) * Better lowering of halving_sub and rounding_halving_add Previously, lower_halving_sub and lower_rounding_halving_add both used 9 ops. This change redirects halving_sub to use rounding_halving_add, and redirects rounding_halving_add to use halving_add. In the case that none of these instructions exist natively, this reduces it to 7/8 ops for signed/unsigned halving sub and 6 ops for rounding halving add. More importantly, this lets halving_sub make use of pavgw/b on x86 to reduce it to 3 ops for u8 and u16 inputs. * Make signed rounding_halving_add on x86 use pavgb/w too * Cast result back to signed * Add explanatory comment * Fix comment * Add explanation of signed case 11 July 2022, 18:03:55 UTC
23c4cf1 Rearrange subdirectories in python_bindings (#6835) This is intended to facilitate a few things: - Move all Generators used in tests, apps, etc to a single directory to simplify the build rules (this is especially useful for the work in https://github.com/halide/Halide/pull/6764) - Put all the test and apps stuff under a single directory to facilitate adding some Python packaging that can make integration into Bazel/Blaze builds a bit less painful @alexreinking, does this look like the layout we discussed before? 01 July 2022, 17:10:34 UTC
23a1fa8 Disable testing for apps/linear_algebra on x86-32-linux/Make (#6836) * Disable testing for apps/linear_algebra on x86-32-linux/Make This wasn't biting us before because we were disabling *all* apps/ on x86-32-linux (oops); the recent change to remove python testing under Make also re-enabled this test. TL;DR: this can probably be made to work somehow, but it's not worth debugging, since that case is both pretty nice, and already covered under CMake. It's literally not worth the time to fix. * Update Makefile 01 July 2022, 03:48:04 UTC
6838db0 Remove unused function in callable_generator.cpp (#6834) 30 June 2022, 23:11:21 UTC
b2771c1 Scrub Python from Makefile after buildbot update (#6833) 30 June 2022, 21:26:13 UTC
fac313e Add a new, alternate JIT-call convention (#6777) * Prototype of revised JIT-call convention Experiment to try out a way to call JIT code in C++ using the same calling conventions as AOT code. Very much experimental. * Update Pipeline.h * Add Python support for `compile_to_callable` + make empty_ucon static * Update PyCallable.cpp * Update buffer.py * wip * Update callable.py * WIP * Update custom_allocator.cpp * Update Callable.cpp * Add Generator support for Callables * Update Generator.cpp * Update PyPipeline.cpp * Fixes * Update callable.cpp * Update CMakeLists.txt * create_callable_from_generator * More cleanup * Update Generator.cpp * Fix Python bounds inference * Add Python wrapper for create_callable_from_generator() + Add kwarg support for Callable * Add set_generatorparam_values() + usage * Fix auto_schedule/machine_params parsing The recent refactoring that added `execute_generator` accidentally nuked setting these two GeneratorParams. Oops. Fixed. * Move the type-checking code into a constexpr code * Update Callable.h * clang-tidy * CLANG-TIDY * Add `make_std_function`, + more general cleanup * Update example_jittest.cpp * Update Callable.h * Update Callable.h * More tweaking, smaller CallCheckInfo * Still more cleanup * make_std_function now does Buffer type/dim checking where possible * Add tests for calling `AbstractGenerator::compile_to_callable()` directly * enable exports * Various fixes * Improve fill_slot for Halide::Buffer * kill report_if_error * Update callable_bad_arguments.cpp * Update Pipeline.cpp * Revise error handling * Update Callable.cpp * Update callable.py * Update callable_generator.cpp * Update callable.py * HALIDE_MUST_USE_RESULT -> HALIDE_FUNCTION_ATTRS for Callable 30 June 2022, 20:18:42 UTC
60d2b98 Remove Python bindings from Makefiles (#6821) * Remove Python bindings from Makefiles * Restore test_li2018 in Makefile (now C++-only) * Add dummy `test_python` target for buildbots 30 June 2022, 18:23:48 UTC
ece5fb7 Apply CMAKE_C_COMPILER_LAUNCHER to initmod clang calls (#6831) 30 June 2022, 15:55:12 UTC
d36cd04 Change stub module names in Python to be _pystub rather than _stub (#6830) This is a bit finicky, but making this the default nomenclature will make some downstream usages less ambiguous and a bit easier to manage. (Yes, I realize that #6821 removes the Makefile entirely, but until it lands, it needs fixing there too.) 29 June 2022, 23:15:07 UTC
3e142cf Tweak python apps for better Blaze/Bazel compatibility (#6823) * Tweak python apps/tutorials for better Blaze/Bazel compatibility - Don't write to current directory (rely on an env var to say where to write) - Don't read from arbitrary absolute paths (again, rely on an env var) - Drive-by removal of unnecessary #include in Codegen_LLVM.cpp inside a lambda (!) * Recommended fixes * Revert all changes to tutorial * Revise apps * Remove apps_helpers.py 28 June 2022, 20:55:16 UTC
c12f8a5 Fix for top-of-tree LLVM (#6825) 28 June 2022, 16:52:25 UTC
e0a9825 Update presets to format version 3 (#6824) 28 June 2022, 15:46:06 UTC
feba77c Rework .gitignore (#6822) * reorganize .gitignore * Add exclusions for CMake build * .gitignore: comment, drop stale rules * fully and precisely exclude CMake build tree * add debugging directions to .gitignore * ignore CMake install tree * Sort groups 28 June 2022, 15:39:48 UTC
9e5c5ce Add support for vscale vector code generation. (#6802) Add support for vscale vector code generation. Factored from the fixed_length_vectors branch to make PRs smaller and easier to review. This will be used to support the ARM SVE/SVE2 and RISC V Vector architectures. 27 June 2022, 23:08:12 UTC
0e17e67 [CMake] Mark multi-threaded tests as such (#6810) 27 June 2022, 18:48:05 UTC
fc0f1f7 Fix two minor bugs triggered by an or reduction with early-out (#6807) * Fix two minor bugs triggered by a or reduction with early-out * Gotta print success * Appease clang-tidy 14 June 2022, 22:05:08 UTC
ce75862 Rewrite strided loads of 4 in AlignLoads (#6806) * Rewrite strided loads of 4 in AlignLoads * Add a check for strided 4 load 14 June 2022, 18:08:36 UTC
0ec2740 Fix auto_schedule/machine_params parsing (#6804) The recent refactoring that added `execute_generator` accidentally nuked setting these two GeneratorParams. Oops. Fixed. 06 June 2022, 21:07:18 UTC
f712f4f Minor typedef cleanup (#6800) * cleanup * format 06 June 2022, 16:42:30 UTC
8b31327 Make all tests default to `-fvisibility=hidden` (#6799) * Step 1 * still more * Export the error classes so they can be caught 02 June 2022, 16:44:26 UTC
00b5728 Silence a "possibly uninitialized" warning (#6797) * Silence a "possibly uninitialized" warning At least one compiler thinks we can use this without initialization, which isn't true, but this silences it. * trigger buildbots 02 June 2022, 16:43:40 UTC
e832c4f Pacify clang-tidy (#6796) * Pacify clang-tidy Newer versions can warn about "parameter 'f' shadows member inherited from type 'StubOutputBufferBase'", etc -- easy enough * Update .clang-tidy 01 June 2022, 21:13:24 UTC
4f2251c Add missing include to test_sharding.h (#6795) 01 June 2022, 19:46:03 UTC
2b29bde slow tests should support sharding (#6780) * slow tests should support sharding The simd_op_check test suite is pretty slow (especially for wasm, where it is interpreted); at one point we tried to use ThreadPool to speed it up, but too many pieces of Halide IR aren't threadsafe and we disabled it long ago. This removes the ThreadPool usage entirely, and instead adds support for the GoogleTest 'sharded test' protocol, which uses certain env vars to allow a test to opt in for splitting its test into smaller pieces. At present our buildbot isn't attempting to make use of this feature, but it will be a big win for downstream usage in Google, where tests that run "too long" are problematic and splitting them into multiple shards makes various day to day activities much more pleasant. 01 June 2022, 19:07:23 UTC
76793b4 Add Target support for architectures with implementation specific vector size. (#6786) Move vector_bits_* Target support from fixed_width_vectors branch to make smaller PRs. 31 May 2022, 23:37:08 UTC
255ff18 hexagon_scatter test should run only if target has HVX (#6793) It will run otherwise, but is slow on some other targets; rather than trying to (e.g.) shard it, just skip it 31 May 2022, 21:32:40 UTC
74d9909 Define an AbstractGenerator interface (#6637) * AbstractGenerator (rebased, v3) * Update AbstractGenerator.h * clang-format * Update Generator.cpp * IOKind -> ArgInfoKind * Various cleanups of AbstractGenerator * clang-format * fix pystub * Update abstractgeneratortest_generator.cpp * dead code * ArgInfoDirection * cleanup * Delete PyGenerator.cpp * Update PyStubImpl.cpp * Update PyStubImpl.cpp * Fixes from review comments * Remove `get_` prefix from getters in AbstractGenerator * Missed some fixes * Fixes * Add GeneratorFactoryProvider for generate_filter_main() * Add GeneratorFactoryProvider to generate_filter_main() This provides hooks to allow overriding the Generator(s) that generate_filter_main() can use; normally it defaults to the global registry of C++ Generators, but this allows for (e.g.) alternate-language-bindings to selectively override this (e.g. to enumerate only Generators that are visible in that language, etc). (No visible change in behavior from this PR; this is just cherry-picked from work-in-progress elsewhere to simplify review & merge) * Update Generator.cpp * fixes * Update Generator.cpp * Restore build_module() and build_gradient_module() methods * Update Generator.h * fixes * Update Generator.cpp * Update AbstractGenerator.h 31 May 2022, 18:29:23 UTC
25f615d halide_type_of<>() should always be constexpr (#6790) The ones in HalideRuntime.h have been marked constexpr for a while, but the ones in Float16.h got missed 31 May 2022, 17:15:41 UTC
3ba2f94 LLVM codegen: register AA pipeline if LLVM is older than 14 (#6785) It's the default after https://reviews.llvm.org/D113210 / https://github.com/llvm/llvm-project/commit/13317286f8298eb3bafa9ddebd1c03bef4918948, but still needs to be done for earlier LLVM's. Refs. https://github.com/halide/Halide/issues/6783 Refs. https://github.com/halide/Halide/pull/6718 Partially reverts https://github.com/halide/Halide/pull/6718 27 May 2022, 19:35:32 UTC
0f7d548 Move some options from execute_generator back to generate_filter_main (#6787) Loading plugins and setting the default autoscheduler name both change global state, which isn't a desirable fit for execute_generator(), since it's not intended to mutate global state. (Mutating the state from a main() function is of course a reasonable thing to do.) 27 May 2022, 17:18:23 UTC
d0c53fa [miscompile] Don't de-negate and change direction of shifts-by-unsigned (#6782) I'm afraid the problem is really obvious: https://github.com/halide/Halide/blob/b5f024fa83b6f1cfe5e83a459c9378b7c5bf096d/src/CodeGen_LLVM.cpp#L2628-L2649 ^ the shift direction is treated as flippable by the codegen iff the shift amount is signed :) The newly-added test fails without the fix. I've hit this when writing tests for https://github.com/halide/Halide/pull/6775 26 May 2022, 17:33:31 UTC
ad1e7f6 Convert some assert-only usage of output_types() -> types() (#6779) 24 May 2022, 18:31:02 UTC
83a90e7 Allow overriding of `Generator::init_from_context()` for debug purposes (#6760) * Allow overriding of `Generator::init_from_context()` for debug purposes * Update Generator.h * Attempt to clarify contract 23 May 2022, 18:31:19 UTC
d973993 Add execute_generator() API (#6771) This refactors the existing `generate_filter_main()` call in two, moving the interesting implementation of how to drive AOT into the new `execute_generator()` call (reducing `generate_filter_main()` to parsing argc/argv and error reporting). The new `execute_generator()` is intended to be used (eventually) from Python, as a way to drive Generator compilation from a Python script more easily. The PR doesn't provide a Python wrapper for this call yet (that will come in a subsequent PR). Also, a drive-by removal of the "error_output" arg to generate_filter_main() -- AFAICT, no one has ever used it for anything but stderr, and the refactoring now just directs all errors to `user_error` uniformly. 23 May 2022, 18:31:03 UTC
56acc6e Fix annoying typo in Func.h (#6774) Update Func.h 19 May 2022, 21:22:46 UTC
b5f024f Fix fundamental confusion about target/tune CPU (#6765) * Fix fundamental confusion about target/tune CPU Sooo. Uh, remember when in https://github.com/halide/Halide/pull/6655 we've agreed that we want to add support to precisely specify the CPU for which the code should be *tuned* for, but not *targeted* for. Aka, similar to clang's `-mtune=` option, that does not affect the ISA set selection? So guess what, that's not what we did, apparently. `CodeGen_LLVM::mcpu()` / `halide_mcpu` actually do specify the *target* CPU. It was obvious in retrospect, because e.g. `CodeGen_X86::mattrs()` does not, in fact, ever specify `+avx2`, yet we get AVX2 :) So we've unintentionally added `-march=` support. Oops. While i'd like to add `-march=` support, that was not the goal here. Fixing this is complicated by the fact that `llvm::Target::createTargetMachine()` only takes `CPU Target` string, you can't specify `CPU Tune`. But this is actually a blessing in disguise, because it allows us to fix another bug at the same time: There is a problem with halide "compile to llvm ir assembly", a lot of information from Halide Target is not //really// lowered into LLVM Module, but is embedded as a metadata, that is then extracted by halide `make_target_machine()`. While that is not a problem in itself, it makes it *impossible* to dump the LLVM IR, and manually play with it, because e.g. the CPU [Target] and Attributes (ISA set) are not actually lowered into the form LLVM understands, but are in some halide-specific metadata. So, to fix the first bug, we must lower the CPU Tune into per-function `"tune-cpu"` metadata, and while there we might as well lower `"target-cpu"` and `"target-features"` similarly. * Address review notes * Hopefully silence bogus issue reported by ancient GCC * Call `set_function_attributes_from_halide_target_options()` when JIT compiling * Fix grammar 19 May 2022, 17:10:53 UTC
61f6af7 Add Func::type()/types(), deprecate Func::output_type()/output_types() (#6772) * rename GIOBase::type() and friends * Func::output_type() -> Func::type() * Add type() forwarders for inputs * Add Func::dimensions() wrapper * Update Func.h 19 May 2022, 16:20:19 UTC
13a5470 Update the list of fused_pairs and run validate_fused_group for specialization definitions too (#6770) * Update the list of fused_pairs and run validate_fused_group for specialization definitions too. Fixes https://github.com/halide/Halide/issues/6763. * Address review comments * Add const to auto& 18 May 2022, 17:30:04 UTC
25a3272 add_python_aot_extension should use FUNCTION_NAME for the .so output … (#6767) add_python_aot_extension should use FUNCTION_NAME for the .so output (otherwise you can't produce multiple aot extensions from the same Generator) 16 May 2022, 17:46:56 UTC
cc41e65 Fix Param<T>::set_estimate for T=void (#6766) * Fix Param<T>::set_estimate for T=void * Add tests 16 May 2022, 17:29:13 UTC
09a986e Expand the x86 SIMD variants tested in correctness_vector_reductions (#6762) A recent bug in LLVM codegen was missed because it only affected x86 architectures with earlier-than-AVX2 SIMD enabled; it didn't show up for AVX2 or later. This revamps correctness_vector_reductions to re-run multiple times when multiple SIMD architectures are available on x86 systems. (correctness_vector_reductions was chosen here because it reliably demonstrated the specific failures in this case.) 13 May 2022, 20:56:47 UTC
4ab4ad9 Minor metadata-related cleanups (#6759) (Harvested from #6757, which probably won't land) - Add clarifying comment/reference in Generator - Add assertion to compile_to_multitarget() function - Fix misleading/wrong code in correctness_compile_to_multitarget 13 May 2022, 01:31:38 UTC
b38b661 Deprecate disable_llvm_loop_opt (#4113) (#6754) This PR proposes to (finally) deprecate disable_llvm_loop_opt: - make LLVM codegen default to no loop optimization; you must use enable_llvm_loop_opt explicitly to enable it - disable_llvm_loop_opt still exists, but does nothing (except issue a user_warning that the feature is deprecated) - Remove various uses of disable_llvm_loop_opt - Add comments everywhere that the default is different in Halide 15 and that the disable_llvm_loop_opt feature will be removed entirely in Halide 16 Note that all Halide code at Google has defaulted to having disable_llvm_loop_opt set for ~years now, so this is a well-tested codepath, and consensus on the Issue seemed to be that this was a good move. 10 May 2022, 20:58:26 UTC
a2e89d8 Add GeneratorFactoryProvider to generate_filter_main() (#6755) * Add GeneratorFactoryProvider to generate_filter_main() This provides hooks to allow overriding the Generator(s) that generate_filter_main() can use; normally it defaults to the global registry of C++ Generators, but this allows for (e.g.) alternate-language-bindings to selectively override this (e.g. to enumerate only Generators that are visible in that language, etc). (No visible change in behavior from this PR; this is just cherry-picked from work-in-progress elsewhere to simplify review & merge) * Update Generator.cpp * Fix error handling 10 May 2022, 01:07:57 UTC
a986078 Deprecate GeneratorContext getters with `get_` prefix (#6753) Minor hygiene: most getters in Halide don't have a `get_` prefix. These are very rarely used (only one instance in our test suite I could find) but, hey, cleanliness. 09 May 2022, 21:38:41 UTC
47d8103 Add a `HalideError` base class to Python bindings (#6750) * Add a `HalideError` base class to Python bindings Per suggestion from @alexreinking, this remaps all exceptions thrown by the Halide Python bindings to be `halide.HalideError` (or a subclass thereof), rather than plain old `RuntimeError`. * Remove scalpel left in patient * Don't use a subclass for PyStub error handling 06 May 2022, 00:49:40 UTC
6fbf203 Update hannk README link to hosted models page (#6749) The current one is being sunsetted 05 May 2022, 16:06:37 UTC
557690e Update WABT to 1.0.29 (#6748) 05 May 2022, 16:06:21 UTC
c8531a5 Silence "may be used uninitialized" in Buffer::for_each_element() (#6747) In at least one version of GCC (Debian 11.2.0-16+build1), an optimized build using `Buffer::for_each_element(int *pos)` will give (incorrect) compiler warnings/errors that "pos may be used uninitialized". From inspection of the code I feel pretty sure this is a false positive -- i.e., the optimizer is confused -- and since no other compiler we've encountered issues a similar warning (nor do we see actual misbehavior), I'm inclined not to worry -- but the warning does break some build configurations. Rather than try to fight with selectively disabling this warning, I'm going to propose inserting a memset() here to reassure the compiler that the memory really is initialized; while it's unnecessary, it's likely to be insignificant compared to the cost of usual calls to for_each_element(). (BTW, this is not a new issue, I've seen it for quite a while as this GCC is the default on one of my Linux machines... it just finally annoyed me enough to want to make it shut up.) 05 May 2022, 01:17:40 UTC