Revision history - HEAD - snapshot: 2c68c8bd649bf1bd2cf3bf7bd4f98d247b82b5dc

swh:1:snp:2c68c8bd649bf1bd2cf3bf7bd4f98d247b82b5dc

Revision	Author	Date	Message	Commit Date
a9ea9b5	Steven Johnson	02 December 2022, 00:17:48 UTC	Fix for top-of-tree LLVM (#7194)	02 December 2022, 00:17:48 UTC
43911f4	Steven Johnson	01 December 2022, 18:20:03 UTC	Add a -v flag to generator_main() (#7193) This is a simple thing that just logs the path to all generated file(s) to stdout if `-v=1` is specified. It's intended for people running Generators directly from the commandline, and is intended as a more user-friendly alternative to HL_DEBUG_CODEGEN=1. No makefiles, etc specify it at present, but I anticipate using it in some tooling in the future. Example usage: ``` $ resize_image_bilinear.generator_binary -v 1 -o /tmp -g resize_image_bilinear -n resize_image_bilinear_uint16 -f resize_image_bilinear_uint16 -e assembly,c_header,llvm_assembly,registration,static_library,stmt 'target=arm-64-android' 'input.type=uint16' 'output.type=uint16' Generated file: /tmp/resize_image_bilinear_uint16.s Generated file: /tmp/resize_image_bilinear_uint16.h Generated file: /tmp/resize_image_bilinear_uint16.ll Generated file: /tmp/resize_image_bilinear_uint16.registration.cpp Generated file: /tmp/resize_image_bilinear_uint16.a Generated file: /tmp/resize_image_bilinear_uint16.stmt ```	01 December 2022, 18:20:03 UTC
5a8c324	Steven Johnson	30 November 2022, 01:44:30 UTC	Fix metadata generation for multitarget Generators (#7181) Fix metadata generation for multitarget Generators We had a mechanism in place to ensure that Outputs that got renamed during lowering still emitted the proper names in the metadata... but this didn't work reliably for Multitarget generation. Now it does.	30 November 2022, 01:44:30 UTC
caf4b71	Steven Johnson	29 November 2022, 17:45:49 UTC	Disable unreachable-code clang-tidy warnings (#7182) Some configurations of clang-tidy will (correctly) complain that the code inside the `if` clauses here will never be executed, since it ends up as something like `if (strcmp("foo", "foo"))`... but for testing purposes, we want to keep it, for obvious reasons. It's hard to construct a string-compare here as constexpr, so I'm just going to NOLINT it. Also changed the `count_buffers()` check to a static_assert for simplicity.	29 November 2022, 17:45:49 UTC
2cfc315	Steven Johnson	28 November 2022, 22:11:35 UTC	Tweak the import paths in Python apps & tests (#7179) * Tweak the import paths in Python apps & tests This change makes it a bit easier for me to transform the import paths when merging into Google: we can't set PYTHONPATH, and calling `sys.path.append()` is frowned upon. This should have no effect on the GitHub repo but will make my life easier downstream. * More tweaks * force builds * Update Generator.cpp	28 November 2022, 22:11:35 UTC
73c61c3	Steven Johnson	28 November 2022, 21:52:51 UTC	Add optional "function_info" header output (#7170) Add optional "function_info" header output At first glance, this looks like a subset of what is already provided by the `_metadata()` functionality: describing the argument attributes of an AOT-generated Halide function. However, _metadata() is suboptimal for some use cases: Because it's expressed as ordinary data, we can only process it at runtime; the new functionality is expressed as a `constexpr` data structure, meaning we can process it at compile time if we so choose. (This is quite useful for producing automatic call wrappers, etc). At first I considered adding this to the normal `.h` file, but moving it into a new file is cleaner in a few ways: - It maintains the 'C-only' nature of the existing .h files (adding this would have imposed a C++17-only section on them) - Splitting into a new file means no existing users are affected by this change at all Note also that this is deliberately not replicating all of the existing `_metadata()` functionality (it's just the argument signature, but no e.g. estimates or default values, etc). This approach means that it is probably more sensible to add several separate constexpr "getters" to this file, rather than trying to mash everything together into one clumsy structure. (With _metadata(), there was an incentive to keep the surface area of the API small, even if that meant combining somewhat-unrelated concerns; there is no such incentive here.)	28 November 2022, 21:52:51 UTC
3ff9e66	Dmitry Kurtaev	28 November 2022, 20:00:58 UTC	Use n32:64 in RISC-V data layout (#7175) * Use n32:64 in RISC-V data layout * Remove unused LLVM header	28 November 2022, 20:00:58 UTC
81c79d5	Steven Johnson	28 November 2022, 19:12:11 UTC	README_python.md should be installed with other READMEs (#7177)	28 November 2022, 19:12:11 UTC
270c24a	Dmitry Kurtaev	18 November 2022, 18:52:51 UTC	Migrate from MCJIT to ORC JIT (#7166) * Migrate from MCJIT to ORC JIT	18 November 2022, 18:52:51 UTC
7b0fdf5	Steven Johnson	18 November 2022, 17:08:56 UTC	Add fopen() bottleneck to runtime (#7171) * Add fopen() bottleneck to runtime Prefer using `fopen64()` on Linux systems. Also, drive-by sorting of the list of initmods that was supposed to be kept sorted. * fopen_32 -> fopen, fopen_64 -> fopen_lfs	18 November 2022, 17:08:56 UTC
be055a8	Andrew Adams	16 November 2022, 01:02:37 UTC	Slightly improve error message for non-integer RDom min/extent (#7151) Improve error message for non-integer RDom min/extent Co-authored-by: Steven Johnson <srj@google.com>	16 November 2022, 01:02:37 UTC
41fe8b3	Zalman Stern	11 November 2022, 03:32:07 UTC	Factor simd_op_check into separate files by architecture. (#7163)	11 November 2022, 03:32:07 UTC
9916b4e	Steven Johnson	08 November 2022, 16:54:25 UTC	Add `bfloat` support to `halide_type_to_string()` (#7154)	08 November 2022, 16:54:25 UTC
58421be	Steven Johnson	08 November 2022, 16:54:15 UTC	Call cache.clear between internal functions in CG_C (#7155) We didn't call cache.clear() between internal functions in the C backend, so the cache could try to re-use something declared in a previous (internal, closure) function and would fail to compile. Easy fix. (I'm surprised we haven't seen this fail before now.)	08 November 2022, 16:54:15 UTC
c6815b0	Steven Johnson	08 November 2022, 16:53:43 UTC	C Backend should call halide_buffer_to_string() (#7156) Just assume that this is present and call it for stringify() on buffers in the C backend. (If it's missing, the user will be expected to provide an implementation, as is usual for runtime with the C backend.)	08 November 2022, 16:53:43 UTC
102c059	Andrew Adams	08 November 2022, 00:38:44 UTC	Fix readnone attribute for llvm 16 (#7152) * Fix readnone attribute for llvm 16 The readnone flag was changed to memory(none) when applied to functions. llvm-as dynamically upgrades readnone applied to functions, so our .ll is fine for now, but there were places in the compiler we were manually sticking 'readnone' on a function. Also did a driveby makefile fix to remove some vestigial wasm stuff that was throwing errors with newer versions of llvm-config * Revert formatting changes	08 November 2022, 00:38:44 UTC
8f8edeb	Steven Johnson	03 November 2022, 20:37:51 UTC	Don't use TF_LITE_KERNEL_LOG in apps/hannk (#7147) TF_LITE_KERNEL_LOG was intended for TFLite Micro but usage leaked out into example code; we should use ReportError instead.	03 November 2022, 20:37:51 UTC
1230042	Steven Johnson	02 November 2022, 00:10:51 UTC	Fix Python wheel-building (#7144) Various bits of code rearrangement had invalidated some of the build scripts for Python wheels for our bindings; this fixes that, and also subtracts some other irrelevant stuff that was getting included (e.g. the stub directory). Also updated the "long description" to use README_python.md rather than README.md.	02 November 2022, 00:10:51 UTC
d3e9d85	Steven Johnson	01 November 2022, 22:11:16 UTC	Upgrade some Actions in pip.yml (#7141) Needed to avoid deprecation warnings	01 November 2022, 22:11:16 UTC
b676567	Steven Johnson	01 November 2022, 22:10:52 UTC	Bump Halide version in main's setup.py to 16 (#7142)	01 November 2022, 22:10:52 UTC
bb7715a	Steven Johnson	01 November 2022, 20:13:13 UTC	Move Python apps to toplevel of python_bindings -- they don't belong … (#7140) * Move Python apps to toplevel of python_bindings -- they don't belong under test/ * Update CMakeLists.txt	01 November 2022, 20:13:13 UTC
115f67a	Steven Johnson	01 November 2022, 16:12:26 UTC	Give pip.yml permission to read packages (#7139)	01 November 2022, 16:12:26 UTC
4987365	Steven Johnson	31 October 2022, 22:22:26 UTC	Rewrite python_bindings/apps (#7133) * apps * wip * WIP 2 * Fix comments * _GPU_SCHEDULE_ENUM_MAP * Update blur_generator.py * Add hl.funcs, hl.vars, plus formatting tweaks	31 October 2022, 22:22:26 UTC
e6066ac	Steven Johnson	31 October 2022, 20:07:41 UTC	halide.imageio needs to support arbitrary bufferviews (#7137) * halide.imageio needs to support arbitrary bufferviews As written, the helper code assumed that everything passed in was a numpy array of some sort; this meant that passing hl.Buffer didn't work. Restructured so that we only assume that the objects passed in satisfy the Python buffer protocol, so this should now work very generically. * Update imageio.py * More fixes	31 October 2022, 20:07:41 UTC
5da5dfd	Alexander Root	31 October 2022, 18:36:41 UTC	[x86] Generate AVX512 fixed-point instructions (#7129) * clean-up abs and saturating_pmulhrs, fix AVX512 saturating_ ops * add test coverage for AVX512 fp ops * generate vpabs on AVX512 * faster AVX2 lowering of saturating_pmulhrs	31 October 2022, 18:36:41 UTC
bad945f	Steven Johnson	31 October 2022, 16:57:09 UTC	Apply 'Black' formatter to py/test/correctness and py/test/generators (#7135) * Apply 'Black' formatter to py/test/correctness and py/test/generators Trying to regularize all our Python code to a common style. Should be no functional changes here, just autoformatting + a few tweaks. * Update complexpy_generator.py	31 October 2022, 16:57:09 UTC
0c03ff8	Alex	31 October 2022, 16:22:53 UTC	GitHub Workflows security hardening (#7136) build: harden pip.yml permissions Signed-off-by: Alex <aleksandrosansan@gmail.com> Signed-off-by: Alex <aleksandrosansan@gmail.com>	31 October 2022, 16:22:53 UTC
bd15cee	Alexander Root	29 October 2022, 21:19:47 UTC	[WASM] Use rounding_mul_shift_right for q15mulr_sat_s pattern (#7134) Use rounding_mul_shift_right for WASM q15mulr_sat_s pattern	29 October 2022, 21:19:47 UTC
2f1587e	Steven Johnson	28 October 2022, 00:18:41 UTC	Fix Python buffer handling (#7125) * Fix Python buffer handling In the category of "how did this ever work"... TL;DR: in general, Halide Buffers have the opposite axis ordering from Python/NumPy buffers; in Halide, the most-frequently-varying dimension comes first, while in Python, it comes last. This isn't surprising, though, since Halide's indexing scheme is effectively column-major while NumPy's is row-major. Anyway: what we should have done was to reverse the order of dimensions when converting to/from Halide Buffers vs Python buffers; instead, we kept the same order, then jumped thru hoops to rearrange buffers to fit this setup. This PR does the appropriate axis reordering, fixing the apps and tests as needed. It also adds some helper code for image reading and writing; by default, we use `imageio` for this, but imageio ~always wants RGB/RGBA images to be interleaved (vs the planar that Halide prefers). So, I added the `halide.imageio` package, that has wrapper functions to quietly convert to/from planar as needed. Needless to say, this change is likely to break existing code that is using 3d buffers in Halide, but I think it's the right long-term thing to do. Opinions greatly welcomed here. * Update PyBuffer.cpp * -"for better vectorization" * public halide.imageio utilities should copy() buffers * PEP8 * Update imageio.py * Update imageio.py * add 'reverse_axes' options to Buffer conversions (#7127) * add 'reverse_axes' options to Buffer conversions	28 October 2022, 00:18:41 UTC
48345d9	Steven Johnson	27 October 2022, 02:22:38 UTC	Add range-checking to Buffer objects in Python (#7128) using () to get or set a Buffer element wasn't being checked at runtime for Python, but it clearly should be, because Python. (Note that in C++ we don't always range-check for these operations -- it's limited to `assert()` checks -- but in Python the expectations are clearly different.)	27 October 2022, 02:22:38 UTC
da87cb2	Zalman Stern	26 October 2022, 22:12:28 UTC	RISC V vector predication support intrinsics support (#7119) Turn on vector predication support for RISC V. (First architecture to use this code. Bug fixes included here.) Add architecture specific vector intrinsics support as well. Should not affect anything outside of RISC V.	26 October 2022, 22:12:28 UTC
fd63349	Steven Johnson	25 October 2022, 17:29:07 UTC	Require Python 3.8+ in CMake build (#7117) * Require Python 3.8+ in CMake build * Update CMakeLists.txt * Update CMakeLists.txt	25 October 2022, 17:29:07 UTC
9163310	Zalman Stern	25 October 2022, 05:24:32 UTC	Add support for generating LLVM vector predication intrinsics. (#7111) Add support for generating llvm.vp.* intrinsics. This is particularly useful for RISC V, but it may be a simpler, better optimized path, for Halide vector operations in general. Add support for a maximum vector size that might be larger than the native vector size. RISC V vector LMUL support is an example of an architecture supporting this.	25 October 2022, 05:24:32 UTC
44102c0	Steven Johnson	24 October 2022, 16:37:40 UTC	Add evaluate() and evaluate_may_gpu() to Python bindings (#7108) * Add evaluate() and evaluate_may_gpu() to Python bindings * pacify clang-tidy	24 October 2022, 16:37:40 UTC
5ade1fb	Steven Johnson	21 October 2022, 17:26:49 UTC	Attempt to fix pip build issues (#7098)	21 October 2022, 17:26:49 UTC
8204b05	Steven Johnson	21 October 2022, 17:17:53 UTC	Minor updates to apps/hannk (#7110) * Update hannk to TFLite 2.8.3 * Newer Android NDK use llvm-ar * Avoid 'unscheduled' warning for Elementwise	21 October 2022, 17:17:53 UTC
da22f6f	Andrew Adams	20 October 2022, 19:26:46 UTC	Fix some dead links to the 'master' branch (#7107)	20 October 2022, 19:26:46 UTC
4f7100b	Steven Johnson	20 October 2022, 00:42:58 UTC	Fix subtle CMake Install bugs (#7103) * Update CMakeLists.txt * Update CMakeLists.txt	20 October 2022, 00:42:58 UTC
5256aa6	Steven Johnson	18 October 2022, 04:14:26 UTC	Remove HALIDE_ALLOW_LEGACY_AUTOSCHEDULER_API (#7096) * Remove HALIDE_ALLOW_GENERATOR_EXTERNAL_CODE * Remove HALIDE_ALLOW_LEGACY_AUTOSCHEDULER_API * clang-format * Update CMakeLists.txt	18 October 2022, 04:14:26 UTC
83ccd8e	Steven Johnson	18 October 2022, 04:14:07 UTC	Revert "Update pip.yml to use LLVM 15.0.2" (#7099) Revert "Update pip.yml to use LLVM 15.0.2 (#7097)" This reverts commit 26b1f3c938b07e6fc73f34e5af223c0eaf8e909f.	18 October 2022, 04:14:07 UTC
8f210f1	Steven Johnson	17 October 2022, 23:02:40 UTC	Remove HALIDE_ALLOW_GENERATOR_EXTERNAL_CODE (#7094)	17 October 2022, 23:02:40 UTC
bb6092b	Steven Johnson	17 October 2022, 23:01:13 UTC	Remove everything flagged with HALIDE_ATTRIBUTE_DEPRECATED (#7095)	17 October 2022, 23:01:13 UTC
6702d86	Steven Johnson	17 October 2022, 22:52:03 UTC	Update Halide main branch to v16 (#7093) * Update Halide main branch to v16 * Drop support for LLVM13 in main Now that release/15.x has branched, main is now Halide 16 and no longer needs to support LLVM13. Update the docs, prune the requirements, eliminate old special cases we don't need anymore. * Revert mistaken changes * Update LLVM_Headers.h	17 October 2022, 22:52:03 UTC
26b1f3c	Steven Johnson	17 October 2022, 22:37:48 UTC	Update pip.yml to use LLVM 15.0.2 (#7097) Newly released since this script was written	17 October 2022, 22:37:48 UTC
e70b7d9	Volodymyr Kysenko	17 October 2022, 22:07:53 UTC	Generate dot() in the Metal backend (#7085) * dot() support for Metal backend) * Restrict dot() to floats	17 October 2022, 22:07:53 UTC
0a04beb	Steven Johnson	17 October 2022, 18:31:15 UTC	Update README.md (#7091)	17 October 2022, 18:31:15 UTC
eb2e336	Steven Johnson	17 October 2022, 17:11:05 UTC	Fix #7076, #7077 (#7080) * Fix issue 7076 * fixes * Fixes * Update ScheduleFunctions.cpp * Update ScheduleFunctions.cpp * Update ScheduleFunctions.cpp	17 October 2022, 17:11:05 UTC
8e2cbe0	Alexander Root	14 October 2022, 18:40:17 UTC	[HVX] Fix DistributeShiftsAsMuls (#7083) Fix DistributeShiftsAsMuls	14 October 2022, 18:40:17 UTC
23e22cc	Alex Reinking	13 October 2022, 02:04:56 UTC	Add pip packaging workflow to GHA (#6938) * Add pip packaging workflow * Try adding steps for Windows/macOS * Add CMake/Ninja to runners. * Use MSVC on Windows * Try caching LLVM build * Split LLVM build into separate job * Fix CI * Fix CI * Debugging * Fix path? * Tar for faster transfer and correct perms * Don't build Halide as Universal 2 * Try using ClangCL on Windows * hack in support for universal2 builds * go back to universal2 * add universal2 to CIBW_ARCHS_MACOS * Update workflow to publish to PyPI * Add package metadata * Last fixup before merge	13 October 2022, 02:04:56 UTC
e85b880	Steven Johnson	06 October 2022, 21:20:44 UTC	pacify clang-tidy by removing unused "using" (#7071)	06 October 2022, 21:20:44 UTC
7442ee6	Steven Johnson	06 October 2022, 18:12:37 UTC	Autoscheduler test reorg, part 3 (#7067) * Autoscheduler test reorg, part 3	06 October 2022, 18:12:37 UTC
7086c6f	Steven Johnson	06 October 2022, 18:11:25 UTC	Autoscheduler test reorg, part 2 (#7065) * Autoscheduler test reorg, part 2 Move the Li2018 tests to tests/autoschedulers. Drive-by fix to the Python test.	06 October 2022, 18:11:25 UTC
9de4906	Steven Johnson	06 October 2022, 18:09:38 UTC	Autoscheduler test reorg, part 1 (#7064) * Autoscheduler test reorg, part 1 The end goal here is to move the tests for all autoschedulers into `test/autoschedulers/<name>`. This part handles the tests for the Mullapudi2016 autoscheduler: - Moves from test/auto_schedule - Silences a lot of the irrelevant noise that the tests emit to stdout (see also #7063) - Ensure that the tests that run manual-and-auto tests actually check the times for plausibility (where "plausible" means "no worse than the current status quo")	06 October 2022, 18:09:38 UTC
f360427	Steven Johnson	06 October 2022, 17:21:28 UTC	Improve MSAN under JIT (#7059) Running JIT code under MSAN is (still) not officially supported, as it can (and does) give false positives; that said, this PR reduces the false positives in some cases. Specifically: (1) if we are building a JIT shared runtime (ie, we are jitting), and we are compiling with MSAN enabled (detected via preprocessor), we should always use the real MSAN stubs, even if the Target::MSan feature has been cleared (because JITModule::make_module() clears it). (2) Target::get_jit_from_environment() should add any detected sanitizer bits to the target sniffed from HL_JIT_TARGET	06 October 2022, 17:21:28 UTC
e3bbafd	Steven Johnson	06 October 2022, 17:19:23 UTC	Add a terminate_handler to try to report unhandled exceptions (#7038) * Add a terminate_handler to try to report unhandled exceptions * Move terminate handler to an object file, applied to tests * Update terminate_handler.cpp * Update terminate_handler.cpp * Update HalideTestHelpers.cmake	06 October 2022, 17:19:23 UTC
16243da	Steve Suzuki	04 October 2022, 21:25:16 UTC	Add support for float16 buffer in python extension (#7060)	04 October 2022, 21:25:16 UTC
5822458	Steven Johnson	04 October 2022, 20:24:49 UTC	Upgrade wabt to 1.0.30 (#7058)	04 October 2022, 20:24:49 UTC
236521b	Steven Johnson	28 September 2022, 16:57:49 UTC	Allow redefinition of Generators when in interactive mode (#7053) It's really annoying to be using (e.g.) Colab/Jupyter/etc and get errors about "Generator already defined", so sniff for interactive mode and allow re-registration of the same name in that case.	28 September 2022, 16:57:49 UTC
820ec1f	Steven Johnson	27 September 2022, 16:34:12 UTC	Don't mutate GeneratorParams in PythonGenerators (#7052) * Don't mutate GeneratorParams in PythonGenerators They are really only ever defined at the class level, so if you mutate the value, you are mutating the default value for all future instances. Subtle bug. * Update _generator_helpers.py * Update _generator_helpers.py	27 September 2022, 16:34:12 UTC
d5a8118	Andrew Adams	26 September 2022, 21:47:13 UTC	Make Halide::round behave as documented (#7012) * Clean up some pointless code * Improve comment on Halide::round * Make Halide::round round to even as documented * Explicitly set the rounding mode in the C backend * Use rint on ptx, which is documented to round to even * round to even on win32 * the nvidia libdevice is buggy for doubles See https://reviews.llvm.org/D85236 * Add missing include to C output * Fix rounding in opencl * Don't test opencl with doubles if CLDoubles is not enabled * Work around hexagon issue * Don't try to emit roundeven on wasm * wasm doesn't support float16 * Add vectorizable lowering for round on platforms without roundeven * Use rint on metal for Halide::round * Make round an intrinsic * Constant-fold round in simplifier * d3d12 fix * Bounds of Call::round * Teach the mullapudi cost model about round * Handle PureIntrinsics of const args in bounds * scatter, undef, and require aren't pure * metal doesn't support doubles * More parens * Add missing return * Add vector versions of rint for wasm * Use nearbyint for wasm instead of rint * revert change to mangling * d3d12 doesn't like double input/output buffers * Lower round on arm-32 not-linux * Don't simplify lowering of round-to-nearest-ties-to-even in codegen * Fix infinite loop in round lowering on arm-32-notlinux * Take care to never revisit args in bounds call visitor * Remove defunct comment Co-authored-by: Steven Johnson <srj@google.com>	26 September 2022, 21:47:13 UTC
59353ab	Zalman Stern	26 September 2022, 19:37:26 UTC	Allow CodeGen_LLVM::codegen_buffer_pointer to support vectors. (#7049) This is useful for scatter/gather in the vector predication intrinsics work.	26 September 2022, 19:37:26 UTC
a817414	Zalman Stern	24 September 2022, 00:16:33 UTC	Allow call_intrin to call an LLVM intrinsic with void return type. (#7048) Have ```CodeGen_LLVM::get_vector_type``` return the void type if passed the void type as a scalar base. This makes it possible to call intrinsics returning void via ```CodeGen_LLVM::call_intrin```. Should be very safe, as currently anything doing this would fail inside the routine. The change breaks the invariant that anything returned from get_vector_type is a vector type, but propagating void for function return types is a pretty standard behavior and if this was not intended, it will very likely fail just outside this instead of having failed inside the use. I.e. very low chance of spurious errors from this.	24 September 2022, 00:16:33 UTC
6499ad1	Zalman Stern	23 September 2022, 22:58:27 UTC	Fix false positive use after free warning. (#7046) Latest clang is giving a use after free warning for printing a pointer value after it is freed. The compiler is not exactly wrong in that a pointer is being passed to something after it is freed, but the something is just printing the pointer's value for debugging purposes. Cast through an intptr_t to avoid the warning.	23 September 2022, 22:58:27 UTC
666b87d	Steven Johnson	23 September 2022, 18:15:13 UTC	add_requirement() maintenance (#7045) * add_requirement() maintenance This PR started out as a quick fix to add Python bindings for the `add_requirements` methods on Pipeline and Generator (which were missing), but expanded a bit to fix other issues as well: - The implementation of `Generator::add_requirement` was subtly wrong, in that it only worked if you called the method after everything else in your `generate()` method. Now we accumulate requirements and insert them at the end, so you can call the method anywhere. - We had C++ methods that took both an explicit `vector<Expr>` and also a variadic-template version, but the former required a mutable vector... and fixing this to not require that ended up creating ambiguity about which overloaded call to use. Added an ugly enable_if thing to resolve this. (Side note #1: overloading methods to have both templated and non-templated versions with the same name is probably something to avoid in the future.) (Side note #2: we should probably think more carefully about using variadic templates in our public API in the future; we currently use it pretty heavily, but it tends to be messy and hard to reason about IMHO.) * tidy * remove underscores	23 September 2022, 18:15:13 UTC
7286ec3	Steven Johnson	22 September 2022, 17:07:32 UTC	Fix PyExt error handling (#7042) The current PythonExtensionGen code attempts to provide verbose error (exception) messages by overriding halide_error and saving the message in a thread_local. This isn't safe or correct, however, and (in general) is wrong for any Halide code using multiple threads. #6994 proposes ways to mitigate this (and there are experiments in place to implement it), but unless/until those enhancements land, we can't leave the code in its current state. So: - Don't try to save the text at all. - Optionally log the text to stderr. - Just throw an exception with the numeric error code. This is suboptimal, but better than the existing usually-incorrect-message behavior. - Bonus: wrap both the error and print overloads with `PyGILState_Ensure()`, as we are supposed to, to ensure we don't die.	22 September 2022, 17:07:32 UTC
b4b27b2	Steven Johnson	21 September 2022, 23:20:06 UTC	Revert "Temporarily disable testing for apps/fft (#7033)" (#7040) Revert "Temporarily disable testing for apps/fft (#7033) (#7035)" This reverts commit 48d56d8066f322e016d60d486613837e3670dd00.	21 September 2022, 23:20:06 UTC
33f6a0f	Volodymyr Kysenko	21 September 2022, 22:01:10 UTC	Handle widen_right_* intrinsics in bounds inference (#7039)	21 September 2022, 22:01:10 UTC
7ec2a4e	Steven Johnson	21 September 2022, 17:20:03 UTC	Add stack-size-canary test to apps/fft's CMake file (#7034) * Add stack-size-canary test to apps/fft's CMake file This was apparently meant as a canary for stack size usage, but the necessary setting only happened in the Makefile, not the CMake file. Also, drive-by fix in Makefile to ignore warnings about `-ObjC++` being ignored, which apparently can be the case with current AppleClang configs. * Update CMakeLists.txt	21 September 2022, 17:20:03 UTC
0d02a0b	Steven Johnson	21 September 2022, 00:55:56 UTC	Fix Wasm BulkMemory Codegen + Minor fixes to apps/HelloWasm (#7026) * Minor fixes to apps/HelloWasm * trigger buildbots * Fix Codegen * Update CMakeLists.txt	21 September 2022, 00:55:56 UTC
7351070	Steven Johnson	20 September 2022, 22:52:26 UTC	Codegen_C for user_context (#7031) Codegen_C fixes	20 September 2022, 22:52:26 UTC
ba53b93	Alexander Root	20 September 2022, 22:00:14 UTC	Add reinterpret simplifications (#7029) Co-authored-by: Steven Johnson <srj@google.com>	20 September 2022, 22:00:14 UTC
48d56d8	Steven Johnson	20 September 2022, 19:08:29 UTC	Temporarily disable testing for apps/fft (#7033) (#7035) Want to avoid reporting this known bug while fix is investigated	20 September 2022, 19:08:29 UTC
36d601f	Steven Johnson	19 September 2022, 23:18:19 UTC	Don't use `-g` for EMCC (#7025) * Don't use `-g` for EMCC Combining `-g` with other default EMCC flags will now emit warning/error messages regarding binaryen optimization, so, don't use that flag in the default settings. * Update Error.cpp	19 September 2022, 23:18:19 UTC
3a5941d	Steven Johnson	16 September 2022, 23:38:19 UTC	Appease Python linter (#7022) Apparently it's "more Pythonic" to use "not in" vs "not ... in", etc.	16 September 2022, 23:38:19 UTC
ef10a42	Steven Johnson	16 September 2022, 21:06:17 UTC	Define a Generator framework in Python (#6764) * Define a Generator framework in Python	16 September 2022, 21:06:17 UTC
5ff15bb	Alexander Root	15 September 2022, 01:54:29 UTC	Fix SpecificExpr canonicalization (#7016) fix SpecificExpr canonicalization	15 September 2022, 01:54:29 UTC
18b06f0	Steven Johnson	14 September 2022, 20:49:20 UTC	Revert "[HVX] Simplify constant factor before distributing" (#7013) Revert "[HVX] Simplify constant factor before distributing (#7009)" This reverts commit 69b50af793d6eb850eea3dccb16426479edbff9d.	14 September 2022, 20:49:20 UTC
ace8028	Varun Sharma	14 September 2022, 14:04:29 UTC	Add minimum GitHub token permissions for workflow (#7011) Signed-off-by: Varun Sharma <varunsh@stepsecurity.io> Signed-off-by: Varun Sharma <varunsh@stepsecurity.io>	14 September 2022, 14:04:29 UTC
655211e	Steven Johnson	14 September 2022, 02:05:02 UTC	Rework Python Extension C++ code (again) (#7010) * Rework Python Extension C++ code (again) My previous effort was too clever for itself: while it worked for Halide's build systems, some other build systems (e.g. Blaze) are much more finicky about the C++ files you build Python extensions from, and re-using the same C++ files with different preprocessor settings turns out to be too problematic there, for reasons that aren't important here. Anyway, the important part here was to rework so that (1) All the C++ source files needed are compiled exactly once (2) All the C++ source files needed can be compiled with the same set of preprocessor definitions To that end, I have extended GenGen's `-r` flag to allow using `-e python_extension`; this emits the bare module-registration code by itself. So now, we generate the Python Extension code as before, but define HALIDE_PYTHON_EXTENSION_OMIT_MODULE_DEFINITION to defeat the standalone module registration for each one, then also compile in the new 'standalone' registration, with HALIDE_PYTHON_EXTENSION_MODULE and HALIDE_PYTHON_EXTENSION_FUNCTIONS defined to fill in the blanks. Also, a little drive-by cleanup in CodeGen_C to make extern "C" blocks more findable, and some restructuring in PyExtGen. * Update user_context_generator.cpp * Dummy source file * IF NOT EXISTS before file(WRITE)	14 September 2022, 02:05:02 UTC
27b8a7d	Alexander Root	13 September 2022, 18:14:42 UTC	Add one-sided widening intrinsics. (#6967) * implement widen_right_ ops * update HVX patterns with one-sided widening intrinsics * remove unused HVX pattern flags * strengthen logic for finding rounding shifts Co-authored-by: Steven Johnson <srj@google.com>	13 September 2022, 18:14:42 UTC
69b50af	Alexander Root	12 September 2022, 23:51:33 UTC	[HVX] Simplify constant factor before distributing (#7009) * simplify constant factor before distributing * add simd_op_check test	12 September 2022, 23:51:33 UTC
ff47ab0	Andrew Adams	12 September 2022, 16:25:26 UTC	Fix some bugs in div_round_to_zero (#7008) * Fix some bugs in div_round_to_zero ... and fast_integer_divide_round_to_zero These were never adequately tested, and there were a few issues. * Add missing print	12 September 2022, 16:25:26 UTC
a4f86de	Steven Johnson	11 September 2022, 23:57:33 UTC	Fix Python handling of boolean buffers (#7006) The Python Extension code didn't handle boolean buffers correctly, making it impossible to construct one in Python and pass it thru to Halide-generated code. This fixes that, and also fixes the test that just expected it to fail (!).	11 September 2022, 23:57:33 UTC
c98f193	Zalman Stern	10 September 2022, 16:02:48 UTC	Couple small fixes to update RISC V to current LLVM flags and enable vscale use. (#6995) Couple small fixes to update RISC V to current LLVM flags and enable vscale use. Co-authored-by: Steven Johnson <srj@google.com>	10 September 2022, 16:02:48 UTC
a0a1d09	Steven Johnson	10 September 2022, 00:07:12 UTC	Prohibit C99 VLA usage in runtime code (#7005) * Prohibit C99 VLA usage in runtime code AFAICT we aren't doing this in Halide at present, but some experimental code in Google runtime was doing so; this caused some issues with some experimental Clang patches, but also was never really intended to be used in the first place. Adding the flag here to be sure no unintended use creeps back in. While I was there, took the time to ensure that the flags for runtime are unified across CMake and Make. * oops	10 September 2022, 00:07:12 UTC
4e352d3	Steven Johnson	09 September 2022, 16:49:26 UTC	Clean up Adams2019 CMake file (#7003)	09 September 2022, 16:49:26 UTC
bd74f94	Steven Johnson	08 September 2022, 00:40:05 UTC	Log target info in performance_fast_pow (#6997) (#6998) Try to gather info to track down heisenbug	08 September 2022, 00:40:05 UTC
e5069ef	Steven Johnson	07 September 2022, 21:15:58 UTC	Apply _Halide_place_dll() to _Halide_gengen (#6999) (#7000)	07 September 2022, 21:15:58 UTC
1644e64	Fangrui Song	07 September 2022, 18:28:08 UTC	[Codegen] Adapt ModuleAddressSanitizerPass/ModuleSanitizerCoveragePass renaming (#6996) https://github.com/llvm/llvm-project/commit/93600eb50ceeec83c488ded24fa0fd25f997fec6 renamed ModuleAddressSanitizerPass to AddressSanitizerPass. https://github.com/llvm/llvm-project/commit/4c18670776cd6ac31099a455b2b22b38b0408006 renamed ModuleSanitizerCoveragePass.	07 September 2022, 18:28:08 UTC
cbe2e63	Steven Johnson	06 September 2022, 17:18:50 UTC	Fix compiler warnings in Elf.cpp (#6992) * Fix compiler warnings in Elf.cpp Some versions of GCC will complain that there is a possible use of uninitialized field `Sym<>::st_info` here; that's technically true, in that it is a bitfield that we previously set via two calls, so it temporarily could use uninitialized bits, but those would immediately be overwritten by well-defined bits. That said, the API could have been misused, so I collapsed Sym::set_type and Sym::set_bindings into a single call to avoid this warning. While I was there, I did a little hygiene on Rel<> and Rela<> as well, as there was an unused-but-similarly-dubious API there. Also added some C++17 `if constexpr` love. * Removed constexpr	06 September 2022, 17:18:50 UTC
8b9c081	Alex Reinking	02 September 2022, 01:55:43 UTC	Fixes for Xcode "new" build system. (#6993) 1. TargetExportScript was running into an Xcode bug with its handling of linker flags. Now using XCODE_ATTRIBUTE_EXPORTED_SYMBOLS_LIST as a workaround. 2. Added a missing dependency in Python module definition code. Fixes #6987	02 September 2022, 01:55:43 UTC
ce2e7f3	Steven Johnson	01 September 2022, 17:58:57 UTC	Refactor buffer-unpacking code in PythonExtensionGen (#6991) This moves most of the interesting code into the common module block, so we don't risk duplicating code for extensions that contain multiple function definitions.	01 September 2022, 17:58:57 UTC
95e37ee	Steven Johnson	31 August 2022, 21:44:17 UTC	Improve error-handling in Python Extensions (#6986) * Improve error-handling in Python Extensions Currently, Python Extensions don't make any effort to override `halide_error`, so the default (which aborts) is generally used... this is very unfriendly. This modifies the standard Python Extension glue code to hook halide_error, saving the text in a thread local, and then throwing a Python exception after the extension's AOT call is finished (if an error occurred, of course). Also does a drive-by default hooking of `halide_print` to ensure that it goes to whatever Python thinks that `stdout` is. (Note that it would be really nice if we could use closures of some sort for halide_error, halide_print, etc so that we could save context in the actual Python module, rather than in a thread-local global var, but this currently isn't possible without nontrivial refactoring in the Halide runtime.) * Make Windows happy * Remove dangling code bits * Allow defeating of error-handler via HALIDE_PYTHON_EXTENSION_OMIT_ERROR_AND_PRINT_HANDLERS	31 August 2022, 21:44:17 UTC
e531e24	Alex Reinking	31 August 2022, 16:55:51 UTC	Fix markdown links (#6988)	31 August 2022, 16:55:51 UTC
386e1ee	Steven Johnson	30 August 2022, 20:19:58 UTC	Add `add_halide_runtime` rule (#6985) Fixes #6981	30 August 2022, 20:19:58 UTC
e7c1c86	Steven Johnson	29 August 2022, 23:09:20 UTC	Add test for _Halide_target_export_single_symbol (#6983) Add test for _Halide_target_export_single_symbol	29 August 2022, 23:09:20 UTC
c24b406	Steven Johnson	29 August 2022, 20:48:23 UTC	Add add_halide_python_extension_library() rule (#6979) * Add add_halide_python_extension_library() rule This adds a rule to create a single Python extension library from one (or more) halide_library rules. This allows you to package multiple Halide filters into a single Python module, which is nice because (1) being able to organize is good, and (2) all the filters in a single Python extension module share the same Halide runtime, including (e.g.) thread pools and method overrides. (It also removes the just-recently-added PYTHON_EXTENSION_LIBRARY option from the add_halide_library rule, as this new rule is better and more flexible in pretty much every way.) This modifies the content of our `python_extension` output in such a way that existing uses should be completely unaffected, but defining the right preprocessor macros allows us to split the function wrappers up from the method-definition declaration, so we don't have to generate any new code artifacts to make this work. Partially addresses #6956. * Omits -D in target_compile_definitions * be explicit about setting to empty * Add quotes * Add comments re BUILD_INTERFACE * Add MODULE_NAME comment * Remove "defined in HalideGeneratorHelpers.cmake" * Add comment re add_halide_runtime() * osx, macos, darwin, oh m * blankity blank blank * Use OBJECT library instead * Add comment about X-macros * Update HalideGeneratorHelpers.cmake	29 August 2022, 20:48:23 UTC
5a09dda	Alex Reinking	26 August 2022, 00:42:26 UTC	Fix XCode by wrapping weights in an OBJECT library (#6977) The XCode "new build system" doesn't like generated source files to be associated with more than one target. Going through an OBJECT library like this fixes that problem, but also saves us a compilation, so it's a good thing to do anyway. Fixes #6976	26 August 2022, 00:42:26 UTC
50018c4	Zalman Stern	25 August 2022, 23:01:39 UTC	Small refactor to remove confusion between CodeGen_LLVM and CodeGen_Internal. (#6973) * Small refactor to remove confusion between CodeGen_LLVM and CodeGen_Internal.	25 August 2022, 23:01:39 UTC
4877b9f	Alexander Root	24 August 2022, 16:21:24 UTC	Lower saturating_cast in bounds inference (#6970) * lower saturating_cast in bounds inference * openGL fix to saturating_cast	24 August 2022, 16:21:24 UTC
