https://github.com/halide/Halide

Revision Author Date Message Commit Date
c13b818 Merge branch 'main' into srj/aligned-malloc-with-aligned-alloc 13 February 2023, 18:15:41 UTC
22aed20 Devirtualize the protected compile() methods in Codegen_C (#7341) With the addition of `preprocess_function_body()`, neither of these needs to be virtual, and devirtualizing them avoids `hidden overloaded virtual function` warnings in subclasses that don't override them 11 February 2023, 02:42:05 UTC
6c5ca8e Tiny improvements in codegen in C backend (#7337) * Tiny improvements in codegen in C backend (1) Emit `true` or `false` instead of `(bool)(0ull)` etc for bool literals (2) Avoid redundant temporaries in print_cast_expr(), which occur in a small but nonzero number of cases Basically this means that code currently like ``` bool _523 = (bool)(0ull); bool _524 = (bool)(_523); ... foo(_524); ``` becomes ``` foo(false); ``` ...I'm sure this has no output on final object code, but it makes the generated C code less weird to read. * Also avoid extra intermediates for typed nullptr * Also use std::isnan() and std::isinf() * Update CodeGen_C.cpp 11 February 2023, 00:34:41 UTC
a6c5be7 Add a hook to Codegen_C::compile() (#7335) At least one subclass of Codegen_C currently has to replicate ~all of the compile(LoweredFunc) method, with the result that it has often gone stale (and still is stale) wrt changes in the base; this adds an optional method to allow some modifications to the function body just before it is printed, to avoid redundant code. 10 February 2023, 21:28:51 UTC
88d40c2 Fix issue in find_package in cross-compilation for no OS (#7282) When using a toolchain where Threads libs are not available, which is the case in baremetal target cross-compilation, we were not able to load even the HalideHelpers package. Co-authored-by: Alex Reinking <alex.reinking@gmail.com> 10 February 2023, 18:07:56 UTC
35322c3 Fix a subtle uninitialized-memory-read in Buffer::for_each_value() (#7330) * Fix a subtle uninitialized-memory-read in Buffer::for_each_value() When we flattened dimensions in for_each_value_prep(), we would copy from one past the end, meaning the last element contained uninitialized garbage. (This wasn't noticed as an out-of-bounds read because we overallocated the structure in for_each_value_impl()). This garbage stride was later used to advance ptrs in for_each_value_helper()... but only on the final iteration, so even if the ptr was wrong, it didn't matter, as the ptr was never used again. Under certain MSAN configurations, though, the read would be (correctly) flagged as uninitialized. This fixes the MSAN bug, and also (slightly) improves the efficiency by returning the post-flattened number of dimensions, potentially reducing the number of iterations for_each_value_helper() needs. * Oopsie * Update HalideBuffer.h * Update HalideBuffer.h 10 February 2023, 00:22:10 UTC
ae3f401 Explicitly remove -D_GLIBCXX_ASSERTIONS from LLVM definitions (#7332) Explicitly remove -D_GLIBCXX_ASSERTIONS from LLVM definitions as a workaround for https://reviews.llvm.org/D142279 09 February 2023, 23:08:37 UTC
4156c5a Allow _Float16 as alias for float16_t in halide_type_of<>() (#7325) (#7326) 09 February 2023, 21:51:22 UTC
734e34a Remove deprecated `HVX_shared_object` feature (#7331) This has been marked 'deprecated' for quite a while, and has no effect on codegen or, well, anything else. Let's remove it. 09 February 2023, 17:59:12 UTC
0f6003e Float16: Remove unused header dependency (#7324) IRMutator.h is not needed for the Float16.h. 08 February 2023, 20:26:31 UTC
c3f3318 Fixes for top-of-tree LLVM (#7329) * Fixes for top-of-tree LLVM * fix * times ten * Update LLVM_Output.cpp 08 February 2023, 20:25:37 UTC
ddb515a Improve support for Arm baremetal compilation and runtime (#7286) * Improve support for Arm baremetal compilation and runtime - Add Target feature "semihosting" mode for baremetal runtime - Fix error of aligned_alloc() when compiled by Arm GNU toolchain * Modify comments for Target feature semihosting * Add an example app to guide cross-compilation for baremetal target * Update build steps in HelloBaremetal * Fix line-ending * Set CMake variable BAREMETAL in toolchain file 07 February 2023, 18:41:04 UTC
34d256f Make auto scheduler libs available in HalideHelpers package (#7285) * Make auto scheduler libs available in HalideHelpers package find_package(HalideHelpers) allows us to use add_halide_library(). But auto scheduler libs are not available unless they are in Halide-Interfaces.cmake. Note: Those libraries are not actually linked to the target application, but need to be available for add_custom_command call. 07 February 2023, 18:40:22 UTC
0c7722f Add buffer sync methods to hannk::Tensor class (#7323) Add a few methods for GPU memory interaction. 07 February 2023, 17:37:09 UTC
0b7379f Warn emulated float16 equivalent is generated (#7307) * Warn emulated float16 equivalent is generated 07 February 2023, 17:08:32 UTC
a55a09a Fix Halide cross-compilation (#7073) Use CMAKE_CROSSCOMPILING_EMULATOR for llvm-as and clang imported targets 07 February 2023, 14:17:58 UTC
1ad328a Fix LLVM 17+ build integration on 32-bit systems (#7322) * Fix LLVM 17+ build integration on 32-bit systems Fixes #7319 * add detail and precision to comment 07 February 2023, 01:18:21 UTC
91f3ac0 Fix segfault by nonconstant bound in Adams2019 (#7321) Fix segmentation fault in Adams2019 in case the estimate or bound of Func is set to nonconstant Expr. 06 February 2023, 22:23:47 UTC
01f9e2d Replace some push_backs with emplace_back (#7317) 06 February 2023, 19:05:01 UTC
e9aecee Make visit_leaf() public in hannk/ops.h (#7318) * Make visit_leaf() public in hannk/ops.h This makes it easier for downstream code to experiment with adding ops * Update ops.h 03 February 2023, 21:17:47 UTC
0782d80 Make Callable::call_argv_fast public (#7315) * Make Callable::call_argv_fast public * Add rough specification of the calling convention * Fix a typo 01 February 2023, 18:24:01 UTC
beba53a halide_popcount<uint64_t> is broken (#7313) Would not compile for Win32 or any other compiler without __builtin_popcountll available. (How did this get checked in without being tested on MSVC?) 31 January 2023, 21:08:15 UTC
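Note on beba53a: a 64-bit population count can be written without `__builtin_popcountll`, so that it also compiles on toolchains that lack the builtin. The following is a minimal sketch of the well-known bit-parallel (SWAR) technique. The name `portable_popcount64` is hypothetical, and this is not claimed to be the code Halide actually uses.

```cpp
#include <cstdint>

// Hypothetical helper (not Halide's implementation): counts the set bits
// in a 64-bit value using only shifts, masks, adds and one multiply.
inline int portable_popcount64(uint64_t x) {
    // Each 2-bit field now holds the number of set bits in that field.
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    // Sum adjacent 2-bit counts into 4-bit fields.
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    // Sum adjacent 4-bit counts into 8-bit fields.
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
    // The multiply accumulates all byte counts into the top byte.
    return static_cast<int>((x * 0x0101010101010101ULL) >> 56);
}
```

Because it uses only standard integer arithmetic, a fallback of this kind can sit behind a compiler check and be used wherever the builtin is unavailable.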
fe76ab2 Minimal updates to allow Halide building with LLVM17 (#7309) * Minimal updates to allow Halide building with LLVM17 (Opening as draft initially until Buildbots build the new LLVM versions) * trigger buildbots 30 January 2023, 22:01:42 UTC
dd973f4 Improved halide_popcount (#7225) * Improved halide_popcount * reused popcount64 from Utils.cpp in CodeGen_C * Fixed comment for popcount 25 January 2023, 21:40:54 UTC
810bd0b Hoist vector slices using rewrite rules (#7243) * Hoist slices using rewrite rules This lets us add associative variants more easily, which are helpful in the work on staging strided loads. * Don't hoist extract_element shuffles The Shuffle visitor wants to sink them * Add some static asserts * Add explanatory comment on shuffle hoisting * Fix comment * add lanes predicate to slice hoisting * add vector slice hoisting test cases Co-authored-by: Steven Johnson <srj@google.com> Co-authored-by: Alexander <ajroot@stanford.edu> 21 January 2023, 22:08:30 UTC
bafd60f [x86 & wasm] Split up double saturating-narrows from i32 (#7280) * better x86 double sat-cast + add test * fix wasm too + test Co-authored-by: Steven Johnson <srj@google.com> 20 January 2023, 18:03:25 UTC
c601e4e Add workaround for the const-or-not user_context issue (#635) (#7291) Add a workaround for the const-or-not user_context issue (https://github.com/halide/Halide/issues/635) 20 January 2023, 17:43:56 UTC
2cc0468 Fix issue in add_halide_runtime in cross-compilation (#7284) * Fix issue in add_halide_runtime in cross-compilation add_halide_runtime() tries to build generator executable, but it fails if we are working with cross-compiler toolchain. By using existing generator set as "FROM", we can work around this. 20 January 2023, 17:39:41 UTC
d44e99d Fix error of add_halide_generator in cross-compilation (#7283) In case the project name is CamelCase, add_halide_generator() was not able to find the generator package, because CMake searches <name>Config.cmake or <lower-case-name>-config.cmake 20 January 2023, 13:12:30 UTC
147ff48 Remove dependency on platform threads library (#7297) * Refactor internal ThreadPool.h into halide_thread_pool.h tool * Drop dependency of libHalide on threads library * Remove other redundant uses of Threads::Threads * Update CMake documentation. 20 January 2023, 12:54:34 UTC
314b2fd [HVX] Fix EliminateInterleaves (#7279) * fix EliminateInterleaves Co-authored-by: Steven Johnson <srj@google.com> 20 January 2023, 00:35:14 UTC
c9f3602 Remove the watchdog timer from generator_main(). It was intended to k… (#7295) Remove the watchdog timer from generator_main(). It was intended to kill pathologically slow builds, but in the environment it was added for (Google build servers), it ended up being redundant to existing mechanisms, and removing it allows us to remove a dependency on threading libraries in libHalide. 19 January 2023, 23:48:26 UTC
51a4f6c Emit prototypes for destructor functions in C Backend (#7296) We gathered up the destructors, but only emitted the prototypes if there was at least one non-C++ function declaration needed -- so if you built with cpp_name_mangling enabled, you might omit the right prototype. Fixed and added the right flag to a Generator test to tickle this behavior. 19 January 2023, 23:36:47 UTC
e8e1481 Drop support for MIPS (#7287) (#7289) * Drop support for MIPS (#7287) * Update Target.cpp 18 January 2023, 21:56:14 UTC
888c41c Add CMake support for C++ backend in test/generator (#7274) * Add support for C++ backend in test/generator When the CMake rules were rewritten a while back, the support for building/testing generators with the C++ backend (instead of the standard LLVM, etc) got lost. This adds it back in. Also made some drive-by fixes to the Makefile to enable some tests there that work correctly now. Also made a drive-by fix in Codegen_C to fix allocation nodes that were just wrappers around buffer_get_host -- this prevented the cleanup_on_error test from building with the C++ backend. 18 January 2023, 00:47:52 UTC
5a55f81 Merge branch 'main' into srj/aligned-malloc-with-aligned-alloc 11 January 2023, 01:27:58 UTC
0d43318 Optimize Module::compile() for some edge cases (#7269) * Optimize Module::compile() for some edge cases Avoid redundant `compile_to_buffer()` calls for output requests that can't possibly ever need them. * Avoid mutation 10 January 2023, 19:23:58 UTC
0ac2a9f Merge branch 'main' into srj/aligned-malloc-with-aligned-alloc 10 January 2023, 17:52:13 UTC
a8d88bb Use ::aligned_alloc() instead of std::aligned_alloc() in HalideBuffer.h (#7268) 10 January 2023, 17:51:38 UTC
3b278a2 Merge branch 'main' into srj/aligned-malloc-with-aligned-alloc 09 January 2023, 18:07:21 UTC
eea7696 Update README_python.md (#7266) 09 January 2023, 17:58:04 UTC
c070bb8 Update change following LLVM WASM change f841ad30d77eeb4c51663e68efefdb734c7a3d07 (#7264) * Update change following LLVM WASM change https://github.com/llvm/llvm-project/commit/f841ad30d77eeb4c51663e68efefdb734c7a3d07 * Update checks conditional on LLVM version. 06 January 2023, 00:00:34 UTC
4b74049 Inline into extern function args during bounds inference (#7261) * Inline into extern function args during bounds inference Fixes #7260 * Run CSE once at the end * Actually recursively inline * clang-tidy * trigger buildbots * Make test invariant to the number of times the warning is printed as long as it's at least once Co-authored-by: Steven Johnson <srj@google.com> 05 January 2023, 21:12:50 UTC
04bb986 Conditional allocations shouldn't fail for size=0 in C++ backend (#7255) (#7256) * Conditional allocations shouldn't fail for size=0 in C++ backend (#7255) Allocations can be conditional; if the condition evaluates to false, we end up calling `halide_malloc(0)` (or `halide_tcm_malloc(0)` in the xtensa branch). Since it's legal via spec for `malloc(0)` to return nullptr, we need to be cautious here: if we are compiling with assertions enabled, *and* have a malloc() (etc) implementation that returns nullptr for alloc(0), we need to skip the assertion check, since we know the result won't be used. Note: a similar check will be inserted in the xtensa branch separately. Note 2: LLVM backend already has this check via Codegen_Posix.cpp * Update CodeGen_C.cpp 28 December 2022, 17:31:01 UTC
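Note on 04bb986: the rule described above (a null result is only an error when a non-empty allocation was requested) can be sketched as follows. The helper names are hypothetical and the snippet only illustrates the check; it is not the code emitted by the C++ backend.

```cpp
#include <cstddef>

// Hypothetical helper: a conditional allocation requests 0 bytes when
// its condition is false.
inline std::size_t conditional_alloc_size(bool condition, std::size_t size) {
    return condition ? size : 0;
}

// Hypothetical helper: treat nullptr as a failure only for non-empty
// requests, since malloc(0) is allowed to return nullptr and the result
// of an empty allocation is never used.
inline bool allocation_failed(const void *ptr, std::size_t size) {
    return size != 0 && ptr == nullptr;
}
```

With this shape, an assertion on the allocation result stays enabled for real allocations and is skipped for the size-0 case.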
a306977 Merge branch 'main' into srj/aligned-malloc-with-aligned-alloc 22 December 2022, 18:46:16 UTC
ade8b56 Remove deprecated halide_target_feature_disable_llvm_loop_opt (#7247) * Remove deprecated halide_target_feature_disable_llvm_loop_opt Was deprecated in Halide 15; let's remove in Halide 16 * trigger buildbots * trigger buildbots * Update CodeGen_LLVM.cpp 20 December 2022, 20:05:19 UTC
10345d4 Explicitly stage strided loads (#7230) * Add a pass to do explicit densification of strided loads * densify more types of strided load * Reorder downsample in local laplacian for slightly better performance * Move allocation padding into the IR. Still WIP. * Simplify concat_bits handling * Use evidence from parent scopes to densify * Disallow padding allocations with custom new expressions * Add test for parent scopes * Remove debugging prints. Avoid nested ramps. * Avoid parent scope loops * Update cmakefiles * Fix for large_buffers * Pad stack allocations too * Restore vld2/3/4 generation on non-Apple ARM chips * Appease clang-format and clang-tidy * Silence clang-tidy * Better comments * Comment improvements * Nuke code that reads out of bounds * Fix stage_strided_loads test * Change strategy for loads from external buffers Some backends don't like non-power-of-two vectors. Do two overlapping half-sized loads and shuffle instead of one funny-sized load. * Add explanatory comment to ARM backend * Fix cpp backend shuffling * Fix missing msan annotations * Magnify heap cost effect in stack_vs_heap performance test * Address review comments * clang-tidy * Fix for when same load node occurs in two different allocate nodes 16 December 2022, 17:56:08 UTC
382f813 Fix "may be used uninitialized" warnings in Codegen_C::print_scalarized_expr() (#7244) 16 December 2022, 17:54:21 UTC
da6746e correctness/exception.cpp needs to check HALIDE_WITH_EXCEPTIONS (fixes #7240) (#7241) correctness/exception.cpp needs to check HALIDE_WITH_EXCEPTIONS 14 December 2022, 05:29:45 UTC
1a4a469 Fix some sources of signed integer overflow in the compiler (#7231) * Fix some sources of signed integer overflow in the compiler Also, use compiler intrinsics when possible to handle overflow, as it generates faster code. * Fix msvc macro * Must use result * Actually perform the requested operation 13 December 2022, 16:11:54 UTC
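Note on 1a4a469: overflow-checked signed addition using a compiler intrinsic, with a portable fallback, might look like the sketch below. `CheckedInt64` and `checked_add` are hypothetical names; `__builtin_add_overflow` is the GCC/Clang builtin, and the fallback branch is an assumption about how compilers without it could be handled, not Halide's actual code.

```cpp
#include <cstdint>

// Hypothetical result type: `ok` is false when the addition overflowed.
struct CheckedInt64 {
    bool ok;
    int64_t value;
};

// Adds two signed 64-bit integers without invoking undefined behavior.
inline CheckedInt64 checked_add(int64_t a, int64_t b) {
#if defined(__GNUC__) || defined(__clang__)
    int64_t result = 0;
    // The builtin returns true when the mathematical result does not fit.
    const bool overflowed = __builtin_add_overflow(a, b, &result);
    return {!overflowed, overflowed ? 0 : result};
#else
    // Portable fallback: test the bounds before performing the addition.
    if ((b > 0 && a > INT64_MAX - b) || (b < 0 && a < INT64_MIN - b)) {
        return {false, 0};
    }
    return {true, a + b};
#endif
}
```

Returning the status together with the value makes it hard for a caller to ignore the overflow flag by accident.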
533e6e5 Remove rogue string suffix in simd_op_check_arm.cpp (#7227) * Remove rogue string suffix in simd_op_check_arm.cpp Interestingly, it compiles here, but in some compilers it will fail with "unexpected token". * Update simd_op_check_arm.cpp 12 December 2022, 20:54:46 UTC
548abef Merge branch 'main' into srj/aligned-malloc-with-aligned-alloc 12 December 2022, 18:28:45 UTC
6ecdcbd Tighten alignment promises for halide_malloc() (#7222) This makes a couple of changes to the behavior/implementation of `halide_malloc()`: * Currently, halide_malloc must return a pointer aligned to the maximum meaningful alignment for the platform for the purpose of vector loads and stores. This PR also adds the requirement that the memory returned must be legal to access in an integral multiple of alignment >= the requested size (in other words: you should be able to do vector load/stores "off the end" without causing any faults). * Currently, the `halide_malloc_alignment()` function is used to determine the default alignment; this cannot be overridden by user code (well, it can be, but the override will have no useful effect). It is intended to be "internal only" but is used in at least one place outside the runtime (apps/hannk). This change removes the call entirely, in favor of a call that is harder to access from outside the runtime and much less likely for end users to attempt to call. (It also changes apps/hannk to stop using it.) 11 December 2022, 18:05:55 UTC
16421a7 Revise simd_op_check tests to ignore HL_TARGET (#7207) (#7216) * Revise simd_op_check tests to ignore HL_TARGET (#7207) The simd_op_check tests have historically only run using the value of HL_TARGET, which meant that the coverage they had was low (since HL_TARGET is only set to values that are runnable on at least one buildbot). This change completely disconnects these tests from HL_TARGET; instead, each test now tests for a range of targets appropriate to the architecture being tested. On all platforms, they still compile to assembly and verify that the correct instructions are generated; additionally, if the host platform can JIT for the given target, it verifies that the results are as expected. * Update simd_op_check_riscv.cpp * Update simd_op_check_x86.cpp * Update simd_op_check_x86.cpp * Update simd_op_check_arm.cpp * Add more features that must match; re-enable the bfloat instructions * Update simd_op_check_x86.cpp * Update simd_op_check_riscv.cpp * trigger buildbots * Fix simd_op_check_wasm 09 December 2022, 17:21:30 UTC
ba31688 Increase __clang_major__ check in Float16.h to 16 (#7224) 09 December 2022, 01:22:54 UTC
5faa3d2 Use `aligned_alloc()` when possible Currently, all our `halide_malloc()` implementations just use `malloc()`/`free()`, using overallocation tricks to ensure the right alignment. This PR adds a new implementation, which uses the C11/C++17 `aligned_alloc()` call instead. By default, we use this implementation on all Unixy platforms, with a new Feature, `no_aligned_alloc`, to allow forcing the use of `malloc()` instead. This is necessary because while ~all modern Linux versions support this, Android doesn't support it till API >= 28, and OSX doesn't support it till >= 10.15. (The QuRT allocator will continue to use `malloc()` for now, pending some post-holiday investigation by QC.) We also add a Windows-specific variant that uses their `_aligned_malloc()`/`_aligned_free()` calls; IIRC, the MSVC team has stated that they are unlikely to ever support the standard `aligned_alloc()` calls, for reasons that aren't important here, but do support these as a partial workaround. This will likely need some torture testing, since it's possible that some platforms offer `aligned_alloc()` implementations that have inferior performance to `malloc()`. 09 December 2022, 00:02:30 UTC
22a302b Tighten alignment promises for halide_malloc() This makes a couple of changes to the behavior/implementation of `halide_malloc()`: * Currently, halide_malloc must return a pointer aligned to the maximum meaningful alignment for the platform for the purpose of vector loads and stores. This PR also adds the requirement that the memory returned must be legal to access in an integral multiple of alignment >= the requested size (in other words: you should be able to do vector load/stores "off the end" without causing any faults). * Currently, the `halide_malloc_alignment()` function is used to determine the default alignment; this cannot be overridden by user code (well, it can be, but the override will have no useful effect). It is intended to be "internal only" but is used in at least one place outside the runtime (apps/hannk). This change removes the call entirely, in favor of a call that is harder to access from outside the runtime and much less likely for end users to attempt to call. (It also changes apps/hannk to stop using it.) 08 December 2022, 23:59:07 UTC
066559b Remove check_jit_user_context() from V8 bindings (#7220) Obsolete code from early V8 work, it can trigger inappropriately in some corner-case scenarios. Remove it entirely to avoid false errors. 08 December 2022, 23:35:01 UTC
8fa8221 Fix bonehead version-checking test in HalideBuffer.h for Apple (#7218) 08 December 2022, 04:34:13 UTC
e8615bb clang-tidy: add [[maybe-unused]] to the DECLARE_NO_INITMOD stubs. (#7215) 08 December 2022, 01:22:38 UTC
a7fa32e Use aligned_alloc() as default allocator for HalideBuffer.h on most platforms (#7190) Use aligned_alloc() as default allocator for HalideBuffer.h on most platforms (See also https://github.com/halide/Halide/pull/7189) Modify H::R::Buffer to default to using `aligned_alloc()` instead of `malloc()`, except: - If user code passes a non-null `allocate_fn` or `deallocate_fn`, we always use those (and/or malloc/free) - If the code is compiling under MSVC, never use `aligned_alloc` (Windows doesn't support it) - If HALIDE_RUNTIME_BUFFER_USE_ALIGNED_ALLOC is defined to be 0, never use `aligned_alloc` (this is to allow for usage on e.g. older Android and OSX versions which don't provide `aligned_alloc()` in the stdlib, regardless of C++ versions.) Also, as with #7189, this ensures that the allocated space has the start of the host data as 128-aligned, and also now ensures that the size allocated is 128-aligned (rounding up as needed). 07 December 2022, 17:31:01 UTC
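Note on a7fa32e: the "128-aligned start, size rounded up to a multiple of 128" behavior can be sketched as below, assuming a platform whose C library provides C11 `aligned_alloc()` (so not MSVC, and not the older Android/OSX versions mentioned above). The names `kBufferAlignment`, `round_up_to_alignment`, `aligned_buffer_alloc` and `is_buffer_aligned` are hypothetical and are not taken from HalideBuffer.h.

```cpp
#include <stdlib.h>  // ::aligned_alloc and ::free (C11)

#include <cstddef>
#include <cstdint>

// Assumed alignment, taken from the "128-aligned" wording above.
constexpr std::size_t kBufferAlignment = 128;

// Rounds size up to the next multiple of kBufferAlignment (a power of two).
inline std::size_t round_up_to_alignment(std::size_t size) {
    return (size + kBufferAlignment - 1) & ~(kBufferAlignment - 1);
}

// Hypothetical allocator: the pointer is 128-aligned and the allocated
// size is a multiple of 128, so whole-vector accesses that run past the
// requested size stay inside the allocation. Release with ::free().
inline void *aligned_buffer_alloc(std::size_t size) {
    return ::aligned_alloc(kBufferAlignment, round_up_to_alignment(size));
}

// Returns true when ptr is aligned to kBufferAlignment.
inline bool is_buffer_aligned(const void *ptr) {
    return reinterpret_cast<std::uintptr_t>(ptr) % kBufferAlignment == 0;
}
```

Rounding the size up also keeps the call within the stricter reading of the C11 rule that the size passed to `aligned_alloc()` should be a multiple of the alignment.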
8ce1212 Fix bitrot in PowerPC testing (#7211) * Fix bitrot in PowerPC testing (See #7208) - DataLayout was wrong (and has been for a long time) - simd_op_check_powerpc had errors. Some were easy to fix; the rest I commented out with a TODO since this backend doesn't appear to be in active use. (Want to fix this in preparation for fixing #7207) * Move x86 absd tests to the right place Co-authored-by: Andrew Adams <andrew.b.adams@gmail.com> 07 December 2022, 17:29:19 UTC
35020c5 Extend LLVM IR type mangling to handle scalars. (#7212) Extend LLVM IR type mangling to handle scalars and use this in vector predication intrinsic codegen. Fixes an error generating vector predicated strided stores. 07 December 2022, 07:15:46 UTC
d4b4c50 Add RISC V zvl flag for LLVM version 16 or greater. (#7209) 07 December 2022, 07:15:10 UTC
e0d1e15 Fix issue with vector predicated comparison and select instructions. (#7205) Fix invalid LLVM IR issues with vector predicated comparison and select instructions. Add start of RISC V simd_op_check test. 07 December 2022, 00:58:28 UTC
59f5412 Add bridging for clang _Float16 type. (#7201) Add type bridging between Halide::float16_t and _Float16 if the compiler supports the latter. Testing is done using clang specific logic and may need to be extended for other compilers. I chose not to add support for __fp16 and __bf16 right now as __fp16 is less useful in being storage only and __bf16 also only supports a subset of operations and was running into undefined symbols during compilation that did not look promising. Co-authored-by: Steven Johnson <srj@google.com> 06 December 2022, 22:57:24 UTC
90459b0 Revert "Fix for top-of-tree LLVM" (#7200) Revert "Fix for top-of-tree LLVM (#7194)" This reverts commit a9ea9b565018774e52bb4028cbc91e14cb86959e. 06 December 2022, 00:53:05 UTC
345cf18 Don't attempt to use makecontext()/swapcontext() on Android (#7196) Despite being 'posixy', it doesn't actually implement these calls. 02 December 2022, 20:48:57 UTC
a9ea9b5 Fix for top-of-tree LLVM (#7194) 02 December 2022, 00:17:48 UTC
43911f4 Add a -v flag to generator_main() (#7193) This is a simple thing that just logs the path to all generated file(s) to stdout if `-v=1` is specified. It's intended for people running Generators directly from the commandline, and is intended as a more user-friendly alternative to HL_DEBUG_CODEGEN=1. No makefiles, etc specify it at present, but I anticipate using it in some tooling in the future. Example usage: ``` $ resize_image_bilinear.generator_binary -v 1 -o /tmp -g resize_image_bilinear -n resize_image_bilinear_uint16 -f resize_image_bilinear_uint16 -e assembly,c_header,llvm_assembly,registration,static_library,stmt 'target=arm-64-android' 'input.type=uint16' 'output.type=uint16' Generated file: /tmp/resize_image_bilinear_uint16.s Generated file: /tmp/resize_image_bilinear_uint16.h Generated file: /tmp/resize_image_bilinear_uint16.ll Generated file: /tmp/resize_image_bilinear_uint16.registration.cpp Generated file: /tmp/resize_image_bilinear_uint16.a Generated file: /tmp/resize_image_bilinear_uint16.stmt ``` 01 December 2022, 18:20:03 UTC
5a8c324 Fix metadata generation for multitarget Generators (#7181) Fix metadata generation for multitarget Generators We had a mechanism in place to ensure that Outputs that got renamed during lowering still emitted the proper names in the metadata... but this didn't work reliably for Multitarget generation. Now it does. 30 November 2022, 01:44:30 UTC
caf4b71 Disable unreachable-code clang-tidy warnings (#7182) Some configurations of clang-tidy will (correctly) complain that the code inside the `if` clauses here will never be executed, since it ends up as something like `if (strcmp("foo", "foo"))`... but for testing purposes, we want to keep it, for obvious reasons. It's hard to construct a string-compare here as constexpr, so I'm just going to NOLINT it. Also changed the `count_buffers()` check to a static_assert for simplicity. 29 November 2022, 17:45:49 UTC
2cfc315 Tweak the import paths in Python apps & tests (#7179) * Tweak the import paths in Python apps & tests This change makes it a bit easier for me to transform the import paths when merging into Google: we can't set PYTHONPATH, and calling `sys.path.append()` is frowned upon. This should have no effect on the GitHub repo but will make my life easier downstream. * More tweaks * force builds * Update Generator.cpp 28 November 2022, 22:11:35 UTC
73c61c3 Add optional "function_info" header output (#7170) Add optional "function_info" header output At first glance, this looks like a subset of what is already provided by the `_metadata()` functionality: describing the argument attributes of an AOT-generated Halide function. However, _metadata() is suboptimal for some use cases: Because it's expressed as ordinary data, we can only process it at runtime; the new functionality is expressed as a `constexpr` data structure, meaning we can process it at *compile* time if we so choose. (This is quite useful for producing automatic call wrappers, etc). At first I considered adding this to the normal `.h` file, but moving it into a new file is cleaner in a few ways: - It maintains the 'C-only' nature of the existing .h files (adding this would have imposed a C++17-only section on them) - Splitting into a new file means no existing users are affected by this change at all Note also that this is deliberately not replicating all of the existing `_metadata()` functionality (it's just the argument signature, but no e.g. estimates or default values, etc). This approach means that it is probably more sensible to add several separate constexpr "getters" to this file, rather than trying to mash everything together into one clumsy structure. (With _metadata(), there was an incentive to keep the surface area of the API small, even if that meant combining somewhat-unrelated concerns; there is no such incentive here.) 28 November 2022, 21:52:51 UTC
3ff9e66 Use n32:64 in RISC-V data layout (#7175) * Use n32:64 in RISC-V data layout * Remove unused LLVM header 28 November 2022, 20:00:58 UTC
81c79d5 README_python.md should be installed with other READMEs (#7177) 28 November 2022, 19:12:11 UTC
270c24a Migrate from MCJIT to ORC JIT (#7166) * Migrate from MCJIT to ORC JIT 18 November 2022, 18:52:51 UTC
7b0fdf5 Add fopen() bottleneck to runtime (#7171) * Add fopen() bottleneck to runtime Prefer using `fopen64()` on Linux systems. Also, drive-by sorting of the list of initmods that was supposed to be kept sorted. * fopen_32 -> fopen, fopen_64 -> fopen_lfs 18 November 2022, 17:08:56 UTC
be055a8 Slightly improve error message for non-integer RDom min/extent (#7151) Improve error message for non-integer RDom min/extent Co-authored-by: Steven Johnson <srj@google.com> 16 November 2022, 01:02:37 UTC
41fe8b3 Factor simd_op_check into separate files by architecture. (#7163) 11 November 2022, 03:32:07 UTC
9916b4e Add `bfloat` support to `halide_type_to_string()` (#7154) 08 November 2022, 16:54:25 UTC
58421be Call cache.clear between internal functions in CG_C (#7155) We didn't call cache.clear() between internal functions in the C backend, so the cache could try to re-use something declared in a previous (internal, closure) function and would fail to compile. Easy fix. (I'm surprised we haven't seen this fail before now.) 08 November 2022, 16:54:15 UTC
c6815b0 C Backend should call halide_buffer_to_string() (#7156) Just assume that this is present and call it for stringify() on buffers in the C backend. (If it's missing, the user will be expected to provide an implementation, as is usual for runtime with the C backend.) 08 November 2022, 16:53:43 UTC
102c059 Fix readnone attribute for llvm 16 (#7152) * Fix readnone attribute for llvm 16 The readnone flag was changed to memory(none) when applied to functions. llvm-as dynamically upgrades readnone applied to functions, so our .ll is fine for now, but there were places in the compiler we were manually sticking 'readnone' on a function. Also did a driveby makefile fix to remove some vestigial wasm stuff that was throwing errors with newer versions of llvm-config * Revert formatting changes 08 November 2022, 00:38:44 UTC
8f8edeb Don't use TF_LITE_KERNEL_LOG in apps/hannk (#7147) TF_LITE_KERNEL_LOG was intended for TFLite Micro but usage leaked out into example code; we should use ReportError instead. 03 November 2022, 20:37:51 UTC
1230042 Fix Python wheel-building (#7144) Various bits of code rearrangement had invalidated some of the build scripts for Python wheels for our bindings; this fixes that, and also subtracts some other irrelevant stuff that was getting included (e.g. the stub directory). Also updated the "long description" to use README_python.md rather than README.md. 02 November 2022, 00:10:51 UTC
d3e9d85 Upgrade some Actions in pip.yml (#7141) Needed to avoid deprecation warnings 01 November 2022, 22:11:16 UTC
b676567 Bump Halide version in main's setup.py to 16 (#7142) 01 November 2022, 22:10:52 UTC
bb7715a Move Python apps to toplevel of python_bindings -- they don't belong … (#7140) * Move Python apps to toplevel of python_bindings -- they don't belong under test/ * Update CMakeLists.txt 01 November 2022, 20:13:13 UTC
115f67a Give pip.yml permission to read packages (#7139) 01 November 2022, 16:12:26 UTC
4987365 Rewrite python_bindings/apps (#7133) * apps * wip * WIP 2 * Fix comments * _GPU_SCHEDULE_ENUM_MAP * Update blur_generator.py * Add hl.funcs, hl.vars, plus formatting tweaks 31 October 2022, 22:22:26 UTC
e6066ac halide.imageio needs to support arbitrary bufferviews (#7137) * halide.imageio needs to support arbitrary bufferviews As written, the helper code assumed that everything passed in was a numpy array of some sort; this meant that passing hl.Buffer didn't work. Restructured so that we only assume that the objects passed in satisfies the Python buffer protocol, so this should now work very generically. * Update imageio.py * More fixes 31 October 2022, 20:07:41 UTC
5da5dfd [x86] Generate AVX512 fixed-point instructions (#7129) * clean-up abs and saturating_pmulhrs, fix AVX512 saturating_ ops * add test coverage for AVX512 fp ops * generate vpabs on AVX512 * faster AVX2 lowering of saturating_pmulhrs 31 October 2022, 18:36:41 UTC
bad945f Apply 'Black' formatter to py/test/correctness and py/test/generators (#7135) * Apply 'Black' formatter to py/test/correctness and py/test/generators Trying to regularize all our Python code to a common style. Should be no functional changes here, just autoformatting + a few tweaks. * Update complexpy_generator.py 31 October 2022, 16:57:09 UTC
0c03ff8 GitHub Workflows security hardening (#7136) build: harden pip.yml permissions Signed-off-by: Alex <aleksandrosansan@gmail.com> Signed-off-by: Alex <aleksandrosansan@gmail.com> 31 October 2022, 16:22:53 UTC
bd15cee [WASM] Use rounding_mul_shift_right for q15mulr_sat_s pattern (#7134) Use rounding_mul_shift_right for WASM q15mulr_sat_s pattern 29 October 2022, 21:19:47 UTC
2f1587e Fix Python buffer handling (#7125) * Fix Python buffer handling In the category of "how did this ever work"... TL;DR: in general, Halide Buffers have the opposite axis ordering from Python/NumPy buffers; in Halide, the most-frequently-varying dimension comes first, while in Python, it comes last. This isn't surprising, though, since Halide's indexing scheme is effectively column-major while NumPy's is row-major. Anyway: what we *should* have done was to reverse the order of dimensions when converting to/from Halide Buffers vs Python buffers; instead, we kept the same order, then jumped thru hoops to rearrange buffers to fit this setup. This PR does the appropriate axis reordering, fixing the apps and tests as needed. It also adds some helper code for image reading and writing; by default, we use `imageio` for this, but imageio ~always wants RGB/RGBA images to be interleaved (vs the planar that Halide prefers). So, I added the `halide.imageio` package, that has wrapper functions to quietly convert to/from planar as needed. Needless to say, this change is likely to break existing code that is using 3d buffers in Halide, but I think it's the right long-term thing to do. Opinions greatly welcomed here. * Update PyBuffer.cpp * -"for better vectorization" * public halide.imageio utilities should copy() buffers * PEP8 * Update imageio.py * Update imageio.py * add 'reverse_axes' options to Buffer conversions (#7127) * add 'reverse_axes' options to Buffer conversions 28 October 2022, 00:18:41 UTC
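Note on 2f1587e: the axis reordering described above can be illustrated with plain extent/stride pairs. The sketch below is not the PyBuffer.cpp code; `Dim`, `dense_innermost_first` and `reverse_axes` are hypothetical names, and strides are counted in elements.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One dimension of a dense buffer: number of elements and element stride.
struct Dim {
    int32_t extent;
    int32_t stride;
};

// Builds a dense layout in which the first dimension varies fastest
// (the Halide convention described above).
inline std::vector<Dim> dense_innermost_first(const std::vector<int32_t> &extents) {
    std::vector<Dim> dims;
    int32_t stride = 1;
    for (int32_t extent : extents) {
        dims.push_back({extent, stride});
        stride *= extent;
    }
    return dims;
}

// Reverses the order of the dimensions without touching the data, giving
// the "last dimension varies fastest" view that NumPy uses by default.
inline std::vector<Dim> reverse_axes(std::vector<Dim> dims) {
    std::reverse(dims.begin(), dims.end());
    return dims;
}
```

For a planar 640x480 image with 3 channels, the Halide-style dimensions (640, 480, 3) with strides (1, 640, 307200) become shape (3, 480, 640) with strides (307200, 640, 1) after reversal, which is the row-major layout NumPy expects for the same memory.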
48345d9 Add range-checking to Buffer objects in Python (#7128) using () to get or set a Buffer element wasn't being checked at runtime for Python, but it clearly should be, because Python. (Note that in C++ we don't always range-check for these operations -- it's limited to `assert()` checks -- but in Python the expectations are clearly different.) 27 October 2022, 02:22:38 UTC
da87cb2 RISC V vector predication support intrinsics support (#7119) Turn on vector predication support for RISC V. (First architecture to use this code. Bug fixes included here.) Add architecture specific vector intrinsics support as well. Should not affect anything outside of RISC V. 26 October 2022, 22:12:28 UTC
fd63349 Require Python 3.8+ in CMake build (#7117) * Require Python 3.8+ in CMake build * Update CMakeLists.txt * Update CMakeLists.txt 25 October 2022, 17:29:07 UTC