Revision history - refs/tags/v12.0.0 - origin: https://github.com/halide/Halide

Revision	Author	Date	Message	Commit Date
b5a34c3	Alex Reinking	19 May 2021, 20:47:20 UTC	Update README for Halide 12 release. (#6034)	19 May 2021, 20:47:20 UTC
1c0ff0f	Alex Reinking	19 May 2021, 20:11:38 UTC	Fix Windows ZIP package script. (#6035)	19 May 2021, 20:11:38 UTC
dfe0f97	Steven Johnson	19 May 2021, 19:57:54 UTC	Enable a wasm-simd op in simd_op_check that is now generated in LLVM13 (#6024)	19 May 2021, 19:57:54 UTC
6a1e529	Steven Johnson	19 May 2021, 17:21:37 UTC	Remove duplicate -e argument in bilateral_grid (#6008) (#6033)	19 May 2021, 17:21:37 UTC
5bd5a04	Dillon Sharlet	19 May 2021, 05:17:07 UTC	Also invalidate alignment if the type can't represent it. (#6032)	19 May 2021, 05:17:07 UTC
cdc0223	Dillon Sharlet	19 May 2021, 00:46:20 UTC	More simplifier rules (#6017) * More simplifier rules. * More simplifier rules. * More variations of these rules * Remove rules that try to pull negative out of multiply, add quantized ramp rules * Add is_const(x, value) predicate * These might be useful too.	19 May 2021, 00:46:20 UTC
622164e	Dillon Sharlet	18 May 2021, 17:53:16 UTC	Various simplifier improvements (#5993) * Redo hoisting if statements * Track bounds through casts (fixes #5905). * Improve and add rules to simplify TailStrategy::Predicate/TailStrategy::GuardWithIf * Replace out of bounds loads/stores with undef. * Fix min * Replace rules with generated ones. * Replace rules with synthesized rules * Remove unnecessary predicates. * Fix out of bounds load/store removal for loads/stores that are a different type than the allocation * Fix no-op else cases. * Update test for new behavior. * Use unreachable instead of undef for out of bounds loads/stores * Update IROperator.h * Fix unsafe evaluate tests * Learn from (x * a) / b == c * Don't let initial leaves depend on the variable itself. * Remove sketchy rules, only learn from constants. * clang-format * Support any constant equation when learning facts. * clang-format * trigger buildbots * Don't need this with abadams/dont_substitute_complex_constraints * Don't treat unreachable as pure * Don't track possibly overflowing min/max * Remove unreachable in the simplifier. * More aggressive removal of unreachable for ifs * Remove for loops with unreachable bodies * Remove bad rules. * Fix both branches unreachable. * Also make adjacent code to unreachable loops unreachable. Co-authored-by: Steven Johnson <srj@google.com>	18 May 2021, 17:53:16 UTC
253e93d	Steven Johnson	18 May 2021, 16:37:18 UTC	Add an ErrorReporter hook to TfLiteModelRunner (#6021) This allows us to silence some (but not all) of the noise that TfLite logs when we disable our own verbosity.	18 May 2021, 16:37:18 UTC
15f51f3	Steven Johnson	18 May 2021, 01:03:14 UTC	Fix apps/hannk configure_cmake.sh script (#6018) * Fix apps/hannk configure_cmake.sh script * trigger buildbots	18 May 2021, 01:03:14 UTC
b45caa8	Alex Reinking	17 May 2021, 23:48:48 UTC	Add Ubuntu packaging scripts and GHA testing (#5754) * Fix CMake & packaging glitches for Ubuntu package. * Add Ubuntu packaging scripts and presets. * Add GHA workflow to test packaging and usage on Ubuntu * Address review comments.	17 May 2021, 23:48:48 UTC
e711235	Dillon Sharlet	17 May 2021, 22:48:10 UTC	Small HVX fixes (#5990) * Small HVX fixes * Simpler align_up implementation. * Simpler fix for small vector loads. * Fix non-native deinterleaving of arguments to patterns * Tweak register count for Hexagon. * I don't know what happened but this works now. * Bits and bytes r hard * Avoid bias wrapper when not needed * I think these deinterleaves are safe. * Align the input of depthwise with depth multiplier 1. * Revert bad merge due to weird GitHub UI Co-authored-by: Steven Johnson <srj@google.com>	17 May 2021, 22:48:10 UTC
ed6eccc	Alex Reinking	17 May 2021, 16:12:30 UTC	fix package.sh perms (#6016)	17 May 2021, 16:12:30 UTC
adb1e05	Andrew Adams	16 May 2021, 22:01:00 UTC	Fix local laplacian upsample (#6011)	16 May 2021, 22:01:00 UTC
deea5ec	Andrew Adams	15 May 2021, 22:50:08 UTC	Substituting complex expressions for constrained scalar inputs makes … (#6014) * Substituting complex expressions for constrained scalar inputs makes a mess Substituting in constants and variables is probably fine. * Remove extra loop Co-authored-by: Dillon Sharlet <dsharlet@google.com>	15 May 2021, 22:50:08 UTC
50d7640	Andrew Adams	15 May 2021, 21:55:49 UTC	Permit "safe" parallel scatters, even when they race (#4841) lets the atomic() scheduling directive also apply to simple assignments in addition to associative commutative operators, e.g. hist(f(r.x), x) = g(x) is safe to parallelize over r.x if the stores are atomic, because the RHS doesn't depend on the hist or r.x	15 May 2021, 21:55:49 UTC
878c3ec	Volodymyr Kysenko	15 May 2021, 16:14:45 UTC	Fix CodeGen_C::print_scalarized_expr (#6006) * Fix CodeGen_C::print_scalarized_expr * CppVector/NativeVector object doesn't have .replace() anymore. * Initialize vector with zero to avoid warning. * Actually, can't assign to CppVector (only to NativeVector), so do ::broadcast instead * Leave it uninitialized	15 May 2021, 16:14:45 UTC
b829d12	Volodymyr Kysenko	15 May 2021, 16:13:54 UTC	Support Shuffle::extract_element from list of scalars in CodeGen_C (#6007)	15 May 2021, 16:13:54 UTC
69eed6e	sksarda	14 May 2021, 16:51:32 UTC	Add -fpic option to debug version on non-windows .ll file (#6000) Else, it eliminates reference to global offset table for symbols to be resolved from remote library causing runtime crash with -debug option. Co-authored-by: Suyog Sarda <ssarda@codeaurora.org>	14 May 2021, 16:51:32 UTC
e0ac07b	Steven Johnson	14 May 2021, 16:49:01 UTC	Upgrade hannk TFLite version to 2.5.0 (#6009) https://github.com/tensorflow/tensorflow/releases/tag/v2.5.0	14 May 2021, 16:49:01 UTC
75a079a	Alex Reinking	08 April 2021, 00:17:40 UTC	Use presets in zip/package.bat	13 May 2021, 18:36:59 UTC
3f3ce62	Alex Reinking	18 February 2021, 21:44:50 UTC	Use presets in tgz/package.sh	13 May 2021, 18:36:59 UTC
a81b86b	Alex Reinking	07 April 2021, 23:17:26 UTC	Move packaging scripts from tools/ to packaging/<type>/	13 May 2021, 18:36:59 UTC
bf32585	Alex Reinking	18 February 2021, 21:41:59 UTC	Move packaging support files into common directory.	13 May 2021, 18:36:59 UTC
f0db1c6	Alex Reinking	14 April 2021, 06:51:16 UTC	Split Halide CMake helpers into separate package.	13 May 2021, 18:36:59 UTC
02a8f87	Alex Reinking	12 March 2021, 11:27:27 UTC	Remove same-directory shared/static mixing	13 May 2021, 18:36:59 UTC
5cdbcb0	Alex Reinking	18 February 2021, 21:40:19 UTC	Re-work packaging to support complex formats (like DEB)	13 May 2021, 18:36:59 UTC
568d18c	Alex Reinking	12 May 2021, 22:18:35 UTC	Fix dependencies in FFT app.	13 May 2021, 18:36:59 UTC
c61b8cb	Alexander Root	13 May 2021, 17:42:34 UTC	Guard against overflow in constant folding for EQ rewrite rules (Fixes #5998) (#6002) * guard against constant folding overflow in EQ rewrite rules * trigger buildbots Co-authored-by: Steven Johnson <srj@google.com>	13 May 2021, 17:42:34 UTC
b2947e9	Alex Reinking	12 May 2021, 20:55:49 UTC	Fix Windows apps (#5999) * Place DLLs on Windows by copying. * Disable Hannk on Windows by default	12 May 2021, 20:55:49 UTC
6bb87cf	Andrew Adams	12 May 2021, 16:07:25 UTC	Stop interleaving stores from generating too-large vectors (#5996) * Stop interleaving stores from generating too-large vectors * Remove integer constant * Use mul_would_overflow helper instead	12 May 2021, 16:07:25 UTC
33308d9	Dillon Sharlet	12 May 2021, 16:06:49 UTC	Add pmaddubsw support (#5997) * Add pmaddubsw support * Move pmaddubsw checks to ssse3 * These patterns are a bit finicky	12 May 2021, 16:06:49 UTC
3dce2d5	Dillon Sharlet	11 May 2021, 19:18:20 UTC	Small H::R::B cleanups and improvements (#5957) * Reuse helpers from halide_buffer_t * Combine decref and decrev_dev to hopefully reduce overhead. * Remove redundant public * This old logic was necessary Co-authored-by: Steven Johnson <srj@google.com>	11 May 2021, 19:18:20 UTC
257b2f5	Dillon Sharlet	11 May 2021, 00:10:01 UTC	Small performance portability tweaks (#5989) Co-authored-by: Steven Johnson <srj@google.com>	11 May 2021, 00:10:01 UTC
6b2732a	Dillon Sharlet	10 May 2021, 22:51:37 UTC	Fix build with asserts enabled (#5987) * Minor cleanups after #5983 * Work around linker breakage!? * This doesn't need to be a constant. * Mark power_of_two constructor explicit Co-authored-by: Steven Johnson <srj@google.com>	10 May 2021, 22:51:37 UTC
fcf9046	Dillon Sharlet	10 May 2021, 21:47:22 UTC	Fix specializing on stride issue (fixes #5907) (#5950) * Fix specializing on stride issue (fixes #5907) * Remove stale comment * Add broadcasting test to CMakeLists.txt * remove_dead_lets -> remove_dead_code * Add test for specializing only on stride. * Remove broadcasting performance test. * Also remove from CMake	10 May 2021, 21:47:22 UTC
6e23346	Steven Johnson	10 May 2021, 20:12:06 UTC	Refactor hannk's compare_vs_tflite code to be mostly library (#5991) * Refactor compare_vs_tflite into library+shell Also, drive-by change to the test names to keep them matching filenames more closely * wip * Update compare_vs_tflite.cpp * wip * Fixes * clang-format * Fix Makefile * trigger buildbots	10 May 2021, 20:12:06 UTC
e33438a	Dillon Sharlet	10 May 2021, 19:23:11 UTC	Revert "Stack input and filter to reduce generated code in FFT app (#5985)" (#5992) This reverts commit d2539287fe4c0c51128a78dc51c2c6d1812cd694.	10 May 2021, 19:23:11 UTC
d253928	Dillon Sharlet	10 May 2021, 17:57:07 UTC	Stack input and filter to reduce generated code in FFT app (#5985) * Stack input and filter to reduce generated code. * Change comments.	10 May 2021, 17:57:07 UTC
9eeade3	Steven Johnson	10 May 2021, 17:24:10 UTC	Rename CHECK->HCHECK, LOG->HLOG in hannk (#5986) Quick-n-dirty rename to avoid conflicts with Abseil/google3. Longer term fix will be forthcoming.	10 May 2021, 17:24:10 UTC
2e47968	Steven Johnson	10 May 2021, 17:23:24 UTC	Fix for upstream LLVM (#5988) * Fix for upstream LLVM * Fixes	10 May 2021, 17:23:24 UTC
5550f96	Dillon Sharlet	07 May 2021, 22:44:40 UTC	Don't hardcode depthwise padding. (#5984)	07 May 2021, 22:44:40 UTC
e0b7d8a	Dillon Sharlet	07 May 2021, 22:44:03 UTC	Refactor quantized multiplications (#5983) * Refactor quantized multiplications * Move comment. * clang-format * base -> mantissa	07 May 2021, 22:44:03 UTC
e980b27	Steven Johnson	07 May 2021, 18:43:54 UTC	advance_ptrs() should use refs, not ptrs (#5981) * advance_ptrs() should use refs, not ptrs Examination of compiled output (x86-64, clang w/ optimizer) shows slightly better codegen. * Update HalideBuffer.h	07 May 2021, 18:43:54 UTC
aac383f	Steven Johnson	07 May 2021, 18:32:39 UTC	Add dynamically-typed scalar inputs to Generator (#5953) (#5965) * Add dynamically-typed scalar inputs to Generator (#5953) * Update stubuser_generator.cpp * clang-format	07 May 2021, 18:32:39 UTC
aba3a80	Andrew Adams	07 May 2021, 02:47:08 UTC	Use a VectorReduce not to determine if any lanes are true in Hexagon backend (#5978)	07 May 2021, 02:47:08 UTC
9f7a459	Steven Johnson	06 May 2021, 21:57:47 UTC	Add missing #include in buffer_util.h (#5979) * Add missing #include in buffer_util.h * Update buffer_util.h	06 May 2021, 21:57:47 UTC
3f799ff	Dillon Sharlet	06 May 2021, 21:35:55 UTC	Optimize copy_from a little (#5977)	06 May 2021, 21:35:55 UTC
3f83f5a	Dillon Sharlet	06 May 2021, 20:25:57 UTC	Add transpose op (#5968) * Add transpose op. * Fix type requirements * Add tests for some misc ops. * Fix type checks	06 May 2021, 20:25:57 UTC
93c878e	Dillon Sharlet	06 May 2021, 20:00:46 UTC	Optimize add generator (#5972) * Optimize add generator. * Update benchmarks * Vectorize wider for more ILP * Better add implementation. * More tweaks. ARM is really sensitive to exactly how these shifts are done. * More performance portable implementation of add. * Put signs back * Add comment.	06 May 2021, 20:00:46 UTC
42b5f79	Dillon Sharlet	06 May 2021, 19:54:35 UTC	Optimize fully connected when there are more than 4 batches (#5969) * Optimize fully connected when there are more than 4 batches * Fix crazy working typo * Do batches inside channels. * Fix missed constant	06 May 2021, 19:54:35 UTC
2ebaedd	Alex Reinking	06 May 2021, 17:06:32 UTC	Don't add Halide DLL to PATH on Windows. (#5973) This conflicts with vcpkg's own binary copying on Windows and makes cross compiling more difficult. It also runs into issues with excessively long commands when the user's PATH variable is very long.	06 May 2021, 17:06:32 UTC
ed989db	Dillon Sharlet	06 May 2021, 16:52:19 UTC	Remove some more old codegen workarounds and cleanups (#5932) * Remove old codegen workarounds * Pre-AVX2 codegen still needs this :( * trigger buildbots Co-authored-by: Steven Johnson <srj@google.com>	06 May 2021, 16:52:19 UTC
813180b	Steven Johnson	06 May 2021, 16:38:22 UTC	HalideBuffer should use D=max_rank instead of D=4 (#5971) This prevents mallocs for some degenerate cases where we need buffers with > 4 dimensions.	06 May 2021, 16:38:22 UTC
ab3670d	Alex Reinking	16 March 2021, 11:37:38 UTC	Clean up CMake helpers. 1. Add HEADER output to add_halide_library. 2. Use $<BUILD_INTERFACE:...> in generated target include paths. 3. Clean up logic (reduce nesting). 4. Use lower-case names for local variables. 5. Print paths to detected Clang and LLD config scripts. 6. Honor normal variable overrides for Halide_TARGET.	06 May 2021, 05:52:34 UTC
0b77168	Alex Reinking	08 April 2021, 18:20:37 UTC	Consistently use Halide_* prefixes in CMake.	06 May 2021, 05:52:34 UTC
0c0b117	Steven Johnson	05 May 2021, 23:41:34 UTC	Upgrade hannk's TFLite version to 2.5.0-rc3 (#5970) * Upgrade hannk's TFLite version to 2.5.0-rc3 * Drive-by cleanup of test names	05 May 2021, 23:41:34 UTC
438ddf8	Dillon Sharlet	05 May 2021, 23:39:44 UTC	Optimize depthwise convolution (#5964) * Try factoring depthwise reduction. * Revert unnecessary change. Co-authored-by: Steven Johnson <srj@google.com>	05 May 2021, 23:39:44 UTC
93dad62	Alex Reinking	05 May 2021, 20:23:56 UTC	Fix tutorial 15 test (#5966)	05 May 2021, 20:23:56 UTC
4a489e6	Steven Johnson	05 May 2021, 19:55:01 UTC	Ensure our local flatbuffers.h is included in preference to system variants (#5962) * Ensure our local flatbuffers.h is included before system variants The local version might be too old for TFLite * Update CMakeLists.txt * Update CMakeLists.txt * Silence the noise noise noise noise NOISE * fix policy	05 May 2021, 19:55:01 UTC
95047ca	Dillon Sharlet	05 May 2021, 16:55:59 UTC	Optimize pooling ops (#5963) * Avoid padding for pool ops. * Use reciprocal to implement division.	05 May 2021, 16:55:59 UTC
115e597	Steven Johnson	05 May 2021, 01:28:21 UTC	Fix FetchContent for hannk (#5960) Apparently using SOURCE_SUBDIR + FetchContent_MakeAvailable() doesn't work on the buildbots. This is equivalent and does work. ¯\_(ツ)_/¯	05 May 2021, 01:28:21 UTC
675748a	Steven Johnson	05 May 2021, 00:02:02 UTC	Add apps/hannk to apps/CMakeLists.txt (#5958) ...but with an option for disabling it, for the buildbots	05 May 2021, 00:02:02 UTC
abefa3c	Dillon Sharlet	04 May 2021, 23:45:21 UTC	Remove nn_ops app in favor of hannk app. (#5893)	04 May 2021, 23:45:21 UTC
7b79dca	Dillon Sharlet	04 May 2021, 23:44:53 UTC	Add HANNK app (#5891) * More accurate approx_log2/exp2. * Add tests from inception_v4 * Improve precision of log2/exp2 related functions. * Add tanh and clean up generators. * Add version-checking to compare_vs_tflite and issue a warning if major and minor versions mismatch * Restore inadvertent @ removal * Add build_hannk/test_hannk targets to Makefile, to make specialized testing on select buildbots easier for now * More hacky padding for depthwise. * Add TODO * trigger buildbots * Add mean op, enable resnet50 to work. * Fix build failure on ARM. * Grammar * Remove stale TODO. * Make tensors shared_ptr. * Fuse double paddings. * Reduce padding for ARM. * Enable DimMap to express alignment. * Remove crops from execute. * Model -> OpGroup refactor. * Add DimMap::align. * Add proper alignment to DimMap. * Recursively transform. * Use cubic polynomials to approximate log2 and exp2. * Add --use_hannk option * Add mul op support. * Add TODO * Add disambiguating parens * Fix boneheaded broadcasting bug. * Less aggressive broadcasting. * Inline basic arithmetic. * Remove unnecessary using directives. * Fix stray unique_ptr<Tensor> * Implement space to depth and depth to space. * Enable scalar boolean comparison ops. * Support ReLUx as unary ops. * CHECK(false) -> LOG(FATAL) * Naming consistency. * More precise mul * Add some easy ops (NEG and SQUARE) * Fix asserts. * Don't segfault if interpreter can't be created * Add comment. * Remove dead file. * Fix excessive precision in softmax. * Lazy-init seeds in compare_vs_tflite, in case use_hannk=0 * Add TODO * Remove scalpel left in patient * Update model.h * Allow broadcasting of c of input 2 * Remove now-pointless specialization helper. * Put the common case specialization first. * Move pooling ops to the same generator file. * Fix softmax correctness issues * Don't benchmark when testing. * Rearrange input parameters. * Remove multiply_quantized helper. 
* kTfLiteError -> kTfLiteDelegateError * Remove unnecessary check for log2(0) * Fix details of ReshapeOp to match tflite's impl * Add Shape op. * Generically handle elementwise operations of any rank. * Some of these aren't elementwise. * Minor cleanups * Minor cleanup in ReshapeOp::execute() * Remove unused functions * Add Greater, GreaterEqual to delegate * clang-format * Update normalizations_generator.cpp * Avoid horrific clang-format suggestion. * clang-format * Fix common_halide test. * Fix typo. * Fix asserts. * clang-format * Save compare_vs_tflite outputs from first run (not post-benchmark) * Enable approx_exp2 for int16 results without overflow. * Clean up precision of transcendentals * Fix accidental widening of shift by a constant. * Move elementwise generators to the same file. * Report profiler after each test * Optimize fully connected a lot * Add elementwise program interpreter * Add elementwise program interpreter * clang-format * WIP LSTM * Fix Interpreter::inputs and outputs. * Fix some precision and scheduling issues of LSTM * Fix LSTM op * Fix build breakage. * Fix comments. * clang-format * Add wrapper for constructing elementwise programs. * clang-format * Use ElementwiseProgram to implement LstmElementwise * Compress programs and instructions by storing them in int16 and more CISC * Reduce verbose repetitive declarations. * Optimize constant zeros. * Use a named constant for the size of each instruction. * Remove unnecessary const instruction. * Optimize and clean up elementwise programs * Reduce overhead from H::R::B * More H::R::B overhead cleanup. * Add missing include. * Various fixes and improvements. 
* Add support for LSTM to the hannk delegate (#5943) * Add support for LSTM to the hannk delegate * clang-format * Add support for dynamic tensors to hannk (#5942) * Initial support for Dynamic Tensors in hannk * Update hannk_delegate.cpp * Fixes * Smarten Tensor::resize() * More H::R::B overhead cleanup * Refactor IsNodeSupported * Minor fixes * Fix member name style * Fix is_alias * Add is_no_op for some ops * Log reasons for node rejection if verbosity >= 1 * clang-format * Fix Concat handling for delegate * Properly parse split op * Add SPLIT_V * Fix regression in PadForOps * Add asserts * Revise ReshapeOp to just use a shape tensor * Update ops.cpp * clang-format * Scale tolerance with the data type. * Compress disassembly a bit * Fix bug with aliasing. * Add --use_tflite flag to compare_vs_tflite Allows disabling the reference run, for running just our delegate alone * Clean up hannk makefile * Regularize all of hannk's own include paths to be relative to apps/hannk; this simplifies things and will allow removing some hacks in Blaze/Bazel and also the upcoming CMake support * Remove some gratuitous uses of std::vector. * Remove unused function. * Remove more instances of std::vector * More SmallVector usage. * Avoid vector in can_use_elementwise_program. * Minor drive by fixes. 
* Remove some more H::R::B copies * Add broadcast support to elementwise programs * Also tweak the generated schema file include path * clang-format * clang-format * Stale TODO * Upgrade hannk to tf2.5 + more (#5948) * Upgrade hannk to tf2.5 + more - upgrade default TFLite to 2.5.0-rc2 - Revise build instructions & assumptions for TFlite (use CMake for it now instead of Bazel) - Revised Android build instructions (now assumes that tflite is built locally rather than pulled from a prebuilt) - Remove the need for flatc/flatbuffers - Minor fixes to the run-on-device scripts * Update Makefile * Update Makefile * Fix some harmless errors related to input slots * TFlite is too sloppy with dimensions. * Intervals are min, max, not min, extent. * Refactor compare_vs_tflite Lots of internal code motion to clean things up and prepare for adding an internal-delegate code path. Immediate change is just `--enable [h][t][x]` instead of the old "use" flags, and reducing the default max-num-of-diffs to 8 instead of 32. * Add an alias for OpPtr * Don't alias inputs that might be used elsewhere. * Use the right tensors when parsing. * Don't schedule sum_filter separately, and avoid 8-bit multiplies for x86 * Improve aliasing logic * Small optimizations. * Add no_bounds_query to elementwise pipelines. * Clean up std:;shared_ptr/H::R::B overhead. * More cvt refactoring, plus clang-format * Fix asserts. * Remove unused trace_ member * Remove unnecessary argument. * Fix x86 * Update compare_vs_tflite.cpp * Add internal-delegate option to CVT * Fix run_compare_on_device for recent changes * Revert possibly broken space to depth optimization. * Add gather and more binary op support. * Update compare_vs_tflite.cpp * Default external-delegate to disabled * clang-format * Better implementation of SpaceToDepth/DepthToSpace. * clang-format * Reduce buffer copy overhead. * Remove unnecessary types. * Consistent multiplication order. 
* Pad to at least FnRank * clang-format * De-inline two operator<<()'s * compare_vs_tflite error handling - If any of the comparisons fail, exit with a nonzero error code - add `--tolerance` flag to allow tweaking allowable tolerance on a per-pipeline basis * Add CMake build rules to apps/hannk (#5955) * Add CMake build rules to hannk * Update ops.cpp * Fix features * Update CMakeLists.txt * Add tests * Fix cmake issues * Use CMAKE_GENERATOR * Update configure_cmake.sh * Delegate Fixes * Update flag handling in compare_vs_tflite This is gratuitous but was bugging me: - Flags now understand both `--flag value` or `--flag=value` - Unknown flags now fail hard instead of being ignored Co-authored-by: Steven Johnson <srj@google.com>	04 May 2021, 23:44:53 UTC
f45d323	Andrew Adams	04 May 2021, 16:32:56 UTC	Non-widening lowering of rounding shifts (#5956) This version lowers it without needing to widen, which is a large win on x86 for 16 and 32-bit types (3.8x faster and 2.8x faster respectively). It's a very slight slowdown for 8-bit because x86 doesn't have 8-bit shift instructions. Also drive-by typo fix.	04 May 2021, 16:32:56 UTC
94c0eca	Dillon Sharlet	04 May 2021, 00:17:39 UTC	Use dot products for sums. (#5954)	04 May 2021, 00:17:39 UTC
5a0d1e5	Volodymyr Kysenko	03 May 2021, 16:34:37 UTC	Support VectorReduce in CodeGen_C (#5952)	03 May 2021, 16:34:37 UTC
8b9deea	Dillon Sharlet	30 April 2021, 20:56:30 UTC	Fix bugs when D != 4 (#5951) * Fix bugs when D != 4 * clang-format	30 April 2021, 20:56:30 UTC
093e8df	Fangrui Song	29 April 2021, 22:58:43 UTC	Replace llvm::sys::fs::F_None with llvm::sys::fs::OF_None (#5946) The former is deprecated.	29 April 2021, 22:58:43 UTC
fcbd2ee	Dillon Sharlet	27 April 2021, 23:52:57 UTC	Fix build issue in runtime. (#5944)	27 April 2021, 23:52:57 UTC
a391e9a	AbdouTlili	27 April 2021, 23:13:06 UTC	adding a note in the README.md to use -j option in make --build (#5938) * adding a note in the README.md to use -j option in make --build * wrapped the added section to 80 column	27 April 2021, 23:13:06 UTC
5a69e9f	Dillon Sharlet	26 April 2021, 21:12:53 UTC	Fix flattening of ramps involving 64-bit mins (#5940) * Fix flattening of ramps involving 64-bit mins. * Use make_const instead of cast.	26 April 2021, 21:12:53 UTC
91e42f4	Steven Johnson	26 April 2021, 20:10:21 UTC	Don't use as_const_int() on temporaries (#5939) Sometimes we get lucky and it's still valid, but it's always wrong.	26 April 2021, 20:10:21 UTC
1b3cbcb	aankit-ca	26 April 2021, 17:55:12 UTC	[Hexagon] Try vdelta/vrdelta before vlut for some shuffles. (#5935) The patch tries to generate vdelta/vrdelta instructions for non-ramp shuffles. Eg: shuffle(lut_expr, < 0, 1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15, 16, 18, 19, 20, 21, 22, 23, 24, 25, 27, 28, 29, 30, 31, 32, 33, 34, 36, 37, 38, 39, 40, 41, 42, 43, 45, 46, 47, 48, 49, 50, 51, 52, 54, 55, 56, 57, 58, 59, 60, 61, 63, 64, 65, 66, 67, 68, 69, 70>) can be generated using vrdelta. The patch also fixes a bug where we bitcast vdelta/vrdelta with 16/32 bits elements to wrong type. User would see the below error: llvm-project/llvm/lib/IR/Instructions.cpp:2905: static llvm::CastInst llvm::CastInst::Create(Instruction::CastOps, llvm::Value , llvm::Type , const llvm::Twine &, llvm::Instruction ): Assertion `castIsValid(op, S, Ty) && "Invalid cast!"' failed. Co-authored-by: Ankit Aggarwal <aankit@quicinc.com>	26 April 2021, 17:55:12 UTC
ba89623	Shivam Gupta	23 April 2021, 16:19:40 UTC	Small Typo fix in lesson 06 (#5936) Signed-off-by: xgupta <shivam98.tkg@rediffmail.com>	23 April 2021, 16:19:40 UTC
a407acd	Steven Johnson	22 April 2021, 16:29:01 UTC	Revert "Temporarily disable hanging test (#5925)" (#5933) This reverts commit 62505857694ab8af2a88a22edf291e630c8c0cfd.	22 April 2021, 16:29:01 UTC
fb13fb0	Dillon Sharlet	21 April 2021, 22:10:32 UTC	Add mul_shift_right intrinsic and related improvements (#5916) * Add multiply_quantized intrinsic * clang-format * Fix build on some compilers. * Fix incorrect saturating_pmulhrs * multiply_quantized -> mul_shift_right * Remove workaround and just cast shift amounts. * Fix error message * Fix declaration of mul_shift_right.	21 April 2021, 22:10:32 UTC
6867005	Shoaib Kamil	21 April 2021, 19:06:50 UTC	Suppress Metal unused function warning (#5913) Co-authored-by: Steven Johnson <srj@google.com>	21 April 2021, 19:06:50 UTC
5dd85ae	Andrew Adams	21 April 2021, 16:50:56 UTC	Let the user pass the Func to use to the reduction helpers (#5929) * Let the user pass the Func to use to the reduction helpers * Pass Funcs by const ref	21 April 2021, 16:50:56 UTC
17d4771	Dillon Sharlet	21 April 2021, 16:04:27 UTC	Update test to reflect behavior we expect. (#5928)	21 April 2021, 16:04:27 UTC
087567f	Dillon Sharlet	21 April 2021, 16:04:09 UTC	Remove old codegen. LLVM rewrites this back to a multiply anyways. (#5930)	21 April 2021, 16:04:09 UTC
6250585	Steven Johnson	20 April 2021, 21:23:26 UTC	Temporarily disable hanging test (#5925) * Temporarily disable hanging test LLVM13 is causing vector_reductions to hang (https://reviews.llvm.org/D100099 appears to be the injection point). Disabling this test to unbreak the buildbots. * Update vector_reductions.cpp	20 April 2021, 21:23:26 UTC
c1de142	Alexander Root	20 April 2021, 21:21:33 UTC	[adams2019] Add caching to autoscheduler (#5697) * add feature caching and block caching to adams2019 autoscheduler * added caching verification for features * add caching docstrings	20 April 2021, 21:21:33 UTC
ac23987	Dillon Sharlet	20 April 2021, 15:02:14 UTC	Speed up simd_op_check by only compiling one pipeline per op (#5918) * Speed up simd_op_check and compute_with * Dense vector loads can be written many different ways.	20 April 2021, 15:02:14 UTC
6963673	Dillon Sharlet	20 April 2021, 00:24:06 UTC	Add Target::ARMv81a and improve shift instruction selection (#5917) * Add Target::ARMv81a and improve shift instruction selection. * Remove merge mistake. * Don't use ARM intrinsic on arm32, it seems to be missing sometimes.	20 April 2021, 00:24:06 UTC
493dbd4	Steven Johnson	17 April 2021, 17:46:20 UTC	Comment out specializations for f64x2.convert_low_i32x4_s/u (#5914) LLVM removed the primitives we need (so our code can't be used), but it also doesn't seem to be generating the expected instructions directly (as claimed). Commenting out to un-break tests; issue has been reported to wasm/llvm team.	17 April 2021, 17:46:20 UTC
9cdb4aa	Andrew Adams	16 April 2021, 22:23:30 UTC	Simplify and improve cuda_mat_mul schedule (#5909) * Simplify and improve cuda_mat_mul schedule	16 April 2021, 22:23:30 UTC
a41cce7	Volodymyr Kysenko	16 April 2021, 20:47:16 UTC	Basic support of predicated loads/stores in C++ backend (#5908) * Basic support of predicated load/stores in C++ backend * Fix formatting and maybe build * Fix * trigger buildbots Co-authored-by: Steven Johnson <srj@google.com>	16 April 2021, 20:47:16 UTC
3531167	Steven Johnson	15 April 2021, 18:34:37 UTC	Drop LLVM10 support from master (#5740) * Drop LLVM10 support from master Update build files to require LLVM11+ in master branch. (Since we only regularly test master with 12 and 13 this is conservative.) Remove all code that is specialized for LLVM < 11.0. * Update CodeGen_ARM.cpp * Update CodeGen_LLVM.cpp	15 April 2021, 18:34:37 UTC
780ebd2	Zalman Stern	15 April 2021, 16:19:21 UTC	Add an error for realize with a different number of outputs than defined for pipeline. (#5906) * Add an error for calling realize with a different number of outputs than the pipeline was compiled with. * Forgot to add test. * A readability sacrifice to the clang deity. * Add CMake file. * Minor change to error text. * Fix logic to handle Funcs returning Tuples. * Formatting.	15 April 2021, 16:19:21 UTC
da02c0d	Jiawen (Kevin) Chen	14 April 2021, 22:17:01 UTC	Add missing "struct" before halide_type_t. (#5904) This allows it to compile as pure C instead of C++. Co-authored-by: Jiawen Chen <jiawen@adobe.com>	14 April 2021, 22:17:01 UTC
ccde965	Steven Johnson	14 April 2021, 21:32:29 UTC	Enable some more wasm simd tests that are now working with top-of-tree LLVM. (#5903)	14 April 2021, 21:32:29 UTC
3ac277b	Dillon Sharlet	14 April 2021, 17:23:02 UTC	Rewrite double and triple narrowing on ARM (#5896) * Rewrite double and triple narrowing on ARM. * clang-format. Co-authored-by: Steven Johnson <srj@google.com>	14 April 2021, 17:23:02 UTC
ce9b324	Steven Johnson	14 April 2021, 16:18:45 UTC	Fix UB in halide_buffer_t::size_in_bytes (#5898) Just a port of https://github.com/halide/Halide/pull/4389 to the equivalent methods in HalideRuntime.h, since offset-from-a-null-pointer is UB in C++.	14 April 2021, 16:18:45 UTC
1ff3e3f	Mario Emmenlauer	12 April 2021, 20:50:02 UTC	CMake build: Add more user control (#5859) * packaging/CMakeLists.txt: Allow users to override RPATH (i.e. for packaging Halide) * CMakeLists.txt: Allow users to override the C++ standard	12 April 2021, 20:50:02 UTC
9cc17b4	Alexander Root	12 April 2021, 17:23:57 UTC	Add fuzzer to bounds_of_expr_in_scope + fix discovered overflow bugs (#5895) * add interval bounds fuzzer * correct overflow checks in bounds inference * catch uint32->int32 overflow in simplifier and revert bounds change	12 April 2021, 17:23:57 UTC
687c7d8	Andrew Adams	09 April 2021, 05:04:11 UTC	Use guarded versions of vars if they exist in bounds inference (#5890)	09 April 2021, 05:04:11 UTC
cf40bc8	Steven Johnson	08 April 2021, 17:14:26 UTC	Improve wasm_threads documentation (#5843) * Improve wasm_threads documentation * Update HalideRuntime.h	08 April 2021, 17:14:26 UTC
71b895e	Alex Reinking	18 February 2021, 21:44:02 UTC	Fix existing presets (remove -O2 stuff, typos)	07 April 2021, 22:51:53 UTC
bd16b37	Alex Reinking	18 February 2021, 21:42:47 UTC	Add shebang line to autotune_loop.sh	07 April 2021, 22:51:53 UTC
