Revision history - refs/heads/rootjalex/improve_constant_bounds - origin: https://github.com/halide/Halide

Revision	Author	Date	Message	Commit Date
393cf58	Alexander Root	27 September 2021, 20:52:25 UTC	fix up some documentation	27 September 2021, 20:52:25 UTC
9878ac9	Alexander Root	15 September 2021, 17:10:18 UTC	rm finished TODOs	15 September 2021, 17:10:18 UTC
483331f	Alexander Root	09 September 2021, 15:46:38 UTC	minor code clean up + allow disabling approximate methods	09 September 2021, 15:46:38 UTC
52ada06	Alexander Root	07 September 2021, 15:49:21 UTC	Merge branch 'rootjalex/improve_constant_bounds' of github.com:halide/Halide into rootjalex/improve_constant_bounds	07 September 2021, 15:49:21 UTC
db050ab	Alexander Root	07 September 2021, 15:40:53 UTC	Merge branch 'master' of github.com:halide/Halide into rootjalex/improve_constant_bounds	07 September 2021, 15:40:53 UTC
f487434	Alexander Root	07 September 2021, 15:37:02 UTC	fix CMakeLists.txt	07 September 2021, 15:37:02 UTC
6e416e9	Alexander Root	07 September 2021, 15:35:40 UTC	fix weird Makefile spacing	07 September 2021, 15:35:40 UTC
c7621f3	Alexander Root	07 September 2021, 15:33:50 UTC	update constant bounds methods + fix opt combo bug	07 September 2021, 15:33:50 UTC
b78b205	Steven Johnson	02 September 2021, 17:58:52 UTC	Upgrade clang-format and clang-tidy to LLVM-12 (#6233)	02 September 2021, 17:58:52 UTC
24d6bd6	Steven Johnson	29 August 2021, 20:02:34 UTC	Hoist unrolled prefetches to top of the block (#6230) * Hoist unrolled prefetches to top of the block When a loop with prefetch is unrolled, the prefetch instructions getting scattered through the loop can cause LLVM codegen issues in some cases (see https://bugs.llvm.org/show_bug.cgi?id=51172). As a partial mitigation for that issue, this PR adds a pass to hoist all prefetch instructions to the top of their loop. This is still a bit experimental; it definitely addresses the codegen issues we see, but makes the use of prefetch potentially less effective (since the hoisted prefetch may be too far from the eventual use to be effective). * appease clang-tidy * Avoid quadratic behavior * Use template instead of std::function * Require prefetch offset to be pure	29 August 2021, 20:02:34 UTC
085e11e	Andrew Adams	27 August 2021, 21:00:14 UTC	Rename inner version of bounds_of_expr_in_scope (#6232) It's not in the explicit namespace that it's requested in (Halide::Internal), so turning on that debugging code results in compile failures. I just gave it a different name to disambiguate.	27 August 2021, 21:00:14 UTC
c860cab	Steven Johnson	25 August 2021, 17:36:14 UTC	Add modernize-make-shared and modernize-make-unique to .clang-tidy and fix warnings (#6222) * Add modernize-make-shared and modernize-make-unique to .clang-tidy and fix warnings * std::initializer_list instead of std::vector * Update Pipeline.h	25 August 2021, 17:36:14 UTC
f43f016	Steven Johnson	24 August 2021, 22:25:57 UTC	More prefetch fixes (#6226) * More prefetch fixes - Arguments to Call::prefetch() must be scalars, not vectors - Add more testcases to correctness_prefetch Addresses more of #6219 (but still not the title issue, i.e. ignoring offset) * Fix horrific bug * Have CodeGen_C emit the same arguments for __builtin_prefetch() as the runtime module * Minor cleanup * Explicitly pass target thru * Fix correctness_prefetch for host-hvx * Add comments	24 August 2021, 22:25:57 UTC
d507b9a	Steven Johnson	23 August 2021, 21:46:26 UTC	Fix bug in prefetch() (#6225) In #6155, we incorrectly assume that we can qualify the 'from' prefetch var by just adding 'prefix'; this isn't true if (e.g.) there are any splits involved. Instead, we need to walk through the active loops to find a suitable match. In addition, if no match is found, we now fail with an error (rather than quietly doing something undefined), as the 'from' var is required to be from an active loop. (Addresses some-but-not-all of #6219)	23 August 2021, 21:46:26 UTC
c849dcb	Alexander Root	23 August 2021, 18:12:43 UTC	Merge branch 'master' of github.com:halide/Halide into rootjalex/improve_constant_bounds	23 August 2021, 18:12:43 UTC
30040cd	Steven Johnson	23 August 2021, 16:31:16 UTC	Prefetch cleanup (#6220) * Use std::move where appropriate * Prefetch cleanup This is (mostly) a cleanup pass to make the flow of Prefetch injection & lowering more obvious to the reader of the code (via commenting and minor code restructuring). Notable exception: the HVX backend processing of Call::prefetch (and relevant runtime code) was refactored to make it (IMHO) less janky. (Also some drive-by insertions of std::move where appropriate)	23 August 2021, 16:31:16 UTC
7c437e4	Alexander Root	22 August 2021, 20:27:39 UTC	fix #6207 (#6214) Co-authored-by: Steven Johnson <srj@google.com>	22 August 2021, 20:27:39 UTC
7aafbb9	Steven Johnson	20 August 2021, 22:53:06 UTC	Fix wasm simd issues (#6217) - f64x2.convert_low_i32x4_s/u are now generating properly at top-of-tree, so re-enable them - f64x2.promote_low_f32x4 is temporarily broken for larger vector widths, so disable it for now (issue is reported and fix is underway) Also, driveby change to .gitignore.	20 August 2021, 22:53:06 UTC
ed5e1e1	Steven Johnson	20 August 2021, 22:25:50 UTC	Use C++17 structured binding instead of std::tie (#6213) * Use C++17 structured binding instead of std::tie * appease clang-tidy	20 August 2021, 22:25:50 UTC
adf64fd	Alexander Root	20 August 2021, 19:40:01 UTC	Merge branch 'master' of github.com:halide/Halide into rootjalex/improve_constant_bounds	20 August 2021, 19:40:01 UTC
7079ff2	Steven Johnson	20 August 2021, 19:11:41 UTC	Fix for upcoming LLVM API change (#6212)	20 August 2021, 19:11:41 UTC
92900d2	Alexander Root	20 August 2021, 18:54:33 UTC	rename ApproxDifferences -> ConstantBounds	20 August 2021, 18:54:33 UTC
06e8865	Dillon Sharlet	20 August 2021, 02:11:04 UTC	Fix issues with predicated interleaving stores on Hexagon (#6211) * Fix issues with predicated interleaving stores on Hexagon. * Fix buffer API usage issue. * Add default device API support for Hexagon. * More DeviceAPI support	20 August 2021, 02:11:04 UTC
c61a930	Alexander Root	19 August 2021, 17:06:04 UTC	Fix unroll failures from adams2019 when the Expr depends on estimates (#6200) * track depends_on_estimate in BoundsInfo - fix bounds_are_constant	19 August 2021, 17:06:04 UTC
9363334	Alexander Root	18 August 2021, 20:29:46 UTC	actually merge with find_constant_bounds	18 August 2021, 20:29:46 UTC
f8d092d	Alexander Root	18 August 2021, 19:52:30 UTC	clang format	18 August 2021, 19:52:30 UTC
9698d42	Alexander Root	18 August 2021, 16:59:10 UTC	add forward-facing approximate bounds functions	18 August 2021, 16:59:10 UTC
0f4e869	Alexander Root	18 August 2021, 16:43:32 UTC	add reorder_terms and substitute_some_lets	18 August 2021, 16:43:32 UTC
7900db3	Alexander Root	18 August 2021, 00:00:39 UTC	add strip_unbounded_terms	18 August 2021, 00:00:39 UTC
8833a45	Alexander Root	17 August 2021, 23:08:15 UTC	add push_rationals	17 August 2021, 23:08:15 UTC
d653a73	Steven Johnson	17 August 2021, 21:15:38 UTC	Add IRMutator::mutate_exprs() (#6203) * Add IRMutator::mutate_exprs() There's a common pattern in many IRMutators that is "mutate a vector<Expr> and optionally let me know if anything is different". (Note, this uses C++17 structured-binding syntax, which we previously weren't using in Halide. Objections?) This adds a shared utility method (well, two, thanks to VariadicVisitor) and plus in the usage in all the places that seemed obvious. I doubt this moves the needle on speed in either direction, but makes for smaller code. * Silence warnings * Update Inline.cpp * Update ParallelRVar.cpp * Update SplitTuples.cpp * Update StorageFlattening.cpp * Revisions	17 August 2021, 21:15:38 UTC
d811a3f	Steven Johnson	16 August 2021, 20:34:58 UTC	More augmentation of debugging code (#6185) * More augmentation of debugging code This expands on #6182 by adding tracking for BoxesTouched, and integrating the nesting levels with the previous code. This allows a more complete vision of what's happening during bounds calculation. * Minor fixes * Unexpose indent	16 August 2021, 20:34:58 UTC
72284a2	Steven Johnson	13 August 2021, 19:12:07 UTC	unsafe_promise_clamped() should be pure (#6199) As discussed in https://github.com/halide/Halide/pull/6189, this intrinsic should probably be Pure.	13 August 2021, 19:12:07 UTC
a081660	Zalman Stern	12 August 2021, 21:06:38 UTC	Add information to comment on `align_loads`. (#6196) * Add information to comment. * Wording improvement.	12 August 2021, 21:06:38 UTC
3b7e1ba	Steven Johnson	12 August 2021, 20:55:14 UTC	[hannk] Remove alignment requirements for shallow DepthwiseConv ops (#6198) * [hannk] Remove alignment requirements for shallow DepthwiseConv ops * Update depthwise_conv_generator.cpp	12 August 2021, 20:55:14 UTC
6229afa	Steven Johnson	12 August 2021, 20:15:28 UTC	Upgrade apps/hannk to TFLite 2.6 (#6197) * Upgrade apps/hannk to TFLite 2.6 * Remove scalpel left in patient	12 August 2021, 20:15:28 UTC
69075b4	Steven Johnson	12 August 2021, 19:51:15 UTC	Internal::promise_clamped() should be pure (Fixes #6186) (#6189) * ApplySplit should use pure promise_clamped() (Fixes #6186) * Make all promise_clamped calls pure * pure_promise_clamped -> promise_clamped	12 August 2021, 19:51:15 UTC
2394250	aankit-ca	12 August 2021, 17:06:00 UTC	[Hexagon] Do not pattern match inside if_then_else block (#6194) * Do not pattern match inside if_then_else block Resolves the compilation error below while generating hannk::upsample_channels_uint8 from hannk/depthwise_conv.generator: Unknown intrinsic dynamic_shuffle The problem occurs when we pattern match hvx intrinsics inside if_then_else nodes and try to scalarize them later. In the patch we prevent matching these intrinsics inside if_then_else blocks. * Do not match for only vector types * pattern match for scalars and scalar-broadcasts Co-authored-by: Ankit Aggarwal <aankit@quicinc.com>	12 August 2021, 17:06:00 UTC
43b412b	Steven Johnson	12 August 2021, 17:02:52 UTC	Update WABT version to latest release (1.0.24) (#6193)	12 August 2021, 17:02:52 UTC
b7fa882	Steven Johnson	11 August 2021, 21:28:10 UTC	Fix unused-variable warning-as-error (#6192) The latest Emscripten compilers will complain about this.	11 August 2021, 21:28:10 UTC
67802cf	Steven Johnson	11 August 2021, 18:53:36 UTC	Add memmove to WasmExecutor callbacks (#6191) Some not-yet-landed variants of the wasm toolchain+runtime environments need this.	11 August 2021, 18:53:36 UTC
e8b5837	Steven Johnson	11 August 2021, 17:33:13 UTC	Add a watchdog timer to Generator (#6184) * Add a watchdog timer to Generator In degenerate conditions (eg, bugs in Halide or LLVM, or pathological user code), running a Generator can take arbitrarily long times (we recently found some buildbots that had Generators that had been running for several days). This adds a simple background thread to generator_main() to ensure that compilations don't take unreasonable lengths of time. It defaults to 15 minutes of wall-clock time, but can be customized by the -t flag. * Update Generator.cpp	11 August 2021, 17:33:13 UTC
fb44637	Steven Johnson	10 August 2021, 22:26:10 UTC	Augment debugging code (#6182) * Augment debugging code I upgraded some debugging code in AddImageChecks and Bounds while tracking down a bug, and I think the upgrades are worth keeping for future use. * clang-format * Minor changes per comments * clang-format	10 August 2021, 22:26:10 UTC
d249fa0	Andrew Adams	06 August 2021, 19:24:14 UTC	Update tutorial todos (#6161) This is based on our discussion in the dev meeting. Feel free to suggest changes.	06 August 2021, 19:24:14 UTC
451cfa8	Steven Johnson	06 August 2021, 15:58:33 UTC	Add argv and metadata support to C++ backend (Issue #2071) (#6179) * Add argv and metadata support to C++ backend (Issue #2071) * legalize_name-> c_print_name * Fix user_context handling	06 August 2021, 15:58:33 UTC
2e229f5	Evan Lee	04 August 2021, 04:54:30 UTC	Rewrite Rules Evaluation Project - Merging Relevant Synthesized Rewrite Rules (#6174) Conducted experiments to analyze the performance effects of adding 4000+ synthesized rewrite rules to Halide. Narrowed down the rules to 11 rewrite rules whose associative & commutative variants are added in this PR. With these rewrite rules, Halide achieves >10% peak memory reductions in 192 cases in apps including camera_pipe, harris, nl_means, and stencil_chain, which is similar to the results (with all 4000+ rules) from this paper - https://dl.acm.org/doi/pdf/10.1145/3428234	04 August 2021, 04:54:30 UTC
8b26454	Steven Johnson	03 August 2021, 20:06:53 UTC	Add more fine-grained prefetch() directive (Issue #3735) (#6155) Add more fine-grained prefetch() directive (Issue #3735)	03 August 2021, 20:06:53 UTC
4f8629c	Steven Johnson	03 August 2021, 00:49:54 UTC	Fix broken wasm-simd extmul instructions due to changes from https://reviews.llvm.org/D106724 (#6177)	03 August 2021, 00:49:54 UTC
0a09bfb	Steven Johnson	02 August 2021, 21:14:12 UTC	Fix for trunk LLVM (#6176) * Fix for trunk LLVM * More Fixes	02 August 2021, 21:14:12 UTC
e52d6ca	Alex Reinking	31 July 2021, 04:43:06 UTC	Fix Xcode issue that requires at least one source file when building a library from objects. (#6175) * Fix Xcode issue that requires at least one source file when building a library from objects. Fixes #6167 * add newline to end of file	31 July 2021, 04:43:06 UTC
a7e8c43	Dillon Sharlet	29 July 2021, 15:53:11 UTC	Partial revert of 8f849ae6514e83f8bf94d05e452a467df352f74c (only (#6173) reverting halide_remote.cpp).	29 July 2021, 15:53:11 UTC
36f6b8c	Alex Reinking	28 July 2021, 17:53:33 UTC	Use generic build command instead of make. Fixes #6163 (#6169)	28 July 2021, 17:53:33 UTC
2b8ec44	Steven Johnson	27 July 2021, 14:57:41 UTC	Remove deprecated realize() Python wrappers (#6162) The C++ versions were removed in #6122, but the Python equivalents were overlooked.	27 July 2021, 14:57:41 UTC
a5585cb	Alexander Root	27 July 2021, 02:07:40 UTC	Add various bounds-related simplifier rules (#6160) * add simplifier rules	27 July 2021, 02:07:40 UTC
2ab9a56	Shoaib Kamil	24 July 2021, 13:55:32 UTC	De-predicate loads and stores in Metal/OpenCL/D3D12 backend (#6158) * Depredicate loads and stores in Metal backend * Fix typo. * Mark override, add additional using * float_t -> float * Update CMakeLists.txt * clang-format * Also scalarize in D3D12 and OpenCL * use const_true() helper	24 July 2021, 13:55:32 UTC
b68393c	Steven Johnson	21 July 2021, 23:08:12 UTC	[hannk] Add a --csv flag to compare_vs_tflite (#6149) * [hannk] Add optional taskset support to the run_on_device scripts * [hannk] Add a --csv flag to compare_vs_tflite This lets us output results in CSV format for easy copy/paste into (eg) spreadsheets.	21 July 2021, 23:08:12 UTC
025a9b9	Dillon Sharlet	21 July 2021, 22:11:55 UTC	Handle depth_multiplier != 1 in a separate op (#6154) * Implement depth_multiplier != 1 in a separate op. * Fix build on GCC * Remove stale comment * clang-format * Add more comments to inv_depth_multiplier	21 July 2021, 22:11:55 UTC
9d7284b	Dillon Sharlet	20 July 2021, 20:50:14 UTC	Move quantization to a helper function depending on the target (#6150) * Move quantization + relu to a helper function depending on the target. * clang-format * x86 has these too actually * Fix typo	20 July 2021, 20:50:14 UTC
5ca8cdf	Dillon Sharlet	20 July 2021, 16:43:26 UTC	Generalize Conv2D to be a Conv of any dimensionality (#6146) * Generalize Conv2D to be a Conv of any dimensionality. * clang-format	20 July 2021, 16:43:26 UTC
5812f33	Volodymyr Kysenko	20 July 2021, 15:45:49 UTC	Configurable minimum size for alignment in align_loads (#6143) Co-authored-by: Steven Johnson <srj@google.com>	20 July 2021, 15:45:49 UTC
b457d3c	Steven Johnson	20 July 2021, 02:16:25 UTC	Add support for int16 output in Conv2D (#6145) This allows us to convert all (currently supported) FC ops into Conv2D ops. Remove all the FC-specific Halide and Op code.	20 July 2021, 02:16:25 UTC
9d1e1e3	Steven Johnson	20 July 2021, 00:33:52 UTC	[hannk] Rewrite FC in terms of Conv2D (#6144) * [hannk] Rewrite FC in terms of Conv2D FullyConnected is very similar to Conv2D, so rather than maintaining multiple similar implementations, let's translate a FullyConnected node into a Conv2D node (with some Reshape nodes as necessary). Note that we keep the old FC logic for int16 outputs, as Conv2D doesn't support those yet; if this PR is landed, a followup PR will add that ability to Conv2D, and the existing FC support will be removed entirely.	20 July 2021, 00:33:52 UTC
bd7ebf5	Steven Johnson	19 July 2021, 19:27:44 UTC	Fix for top-of-tree LLVM (#6142) * Fix for top-of-tree LLVM	19 July 2021, 19:27:44 UTC
557c8e4	Dillon Sharlet	16 July 2021, 17:56:05 UTC	Fix Hexagon vrmpy with 16-bit results (#4248) (#6137) * Fix #4248 * clang-format	16 July 2021, 17:56:05 UTC
769b855	Dillon Sharlet	15 July 2021, 16:22:28 UTC	Add optimization for corner case in conv (#6139) * Add silly optimization for weird cases. * Use transpose	15 July 2021, 16:22:28 UTC
42e1d45	Steven Johnson	15 July 2021, 16:02:00 UTC	[hannk] Allow aliasing of Reshape tensors (#6138) * Allow aliasing of Reshape tensors Previously we didn't allow this because aliased tensors had to have the same rank, which is ~never the case for Reshape. Aliasing for Reshape is a huge win because it essentially becomes a no-op rather than a memcpy. Running against standard set of models shows no regression in differences vs. tflite.	15 July 2021, 16:02:00 UTC
19f2bc7	Dillon Sharlet	14 July 2021, 00:20:25 UTC	Reduce verbosity of compare_vs_tflite further (#6136)	14 July 2021, 00:20:25 UTC
802c22a	Andrew Adams	13 July 2021, 23:13:02 UTC	Don't reinterpret cast when codegenning vector concat (#6125) It confuses the HVX LLVM backend, and shouldn't be necessary anyway.	13 July 2021, 23:13:02 UTC
77207a5	Dillon Sharlet	13 July 2021, 23:04:21 UTC	Optimize shallow depthwise convolutions (#6134) * Add TailStrategy::PredicateLoads and TailStrategy::PredicateStores * Different compilers * PredicateStores is faster than specialize + ShiftInwards * Update comments. * Allow PredicateStores for RVars * Fix test to avoid realize bounds query issues. * Add comments. * clang-format * predicate* is not pure * Fix documentation bugs * Don't allow PredicateStores for reductions. * Substitute more strongly around Provide * Change these back to pure for now to satisfy some logic in ScheduleFunctions * Fix use after free of pred. * Update comments. * Refactor implementation of predication * Visit predicates * Partition loops with predicated loads/stores. * Clean up ApplySplit * Fix inappropriate predicated vectorization of VectorReduce * De-dup GuardWithIf and Predicate * These also handle scalar predicated loads/stores. * Print provide predicates * Don't allow predicated non-innermost splits. * Remove debugging code * Forgot to add new file * Add test to CMake build * Fix bug in simplification of extract_element * Fix issue with mixing uses of guarded expressions inside and outside calls. * Don't lift impure exprs. * clang-format * clang-format again * Add "shallow" version of depthwise for small numbers of channels. * Better name for input_stride_x * Fix performance regression in deep case. * Update performance * Missed rename * Enable tiling of shallow case. * Require x be a dummy dim for shallow depthwise * Small cleanup to avoid ternary * clang-format * Can't use shallow depthwise when stride_x != 1	13 July 2021, 23:04:21 UTC
a762c34	Dillon Sharlet	13 July 2021, 21:54:11 UTC	Add TailStrategy::PredicateLoads and TailStrategy::PredicateStores (#6126) * Add TailStrategy::PredicateLoads and TailStrategy::PredicateStores * Different compilers * PredicateStores is faster than specialize + ShiftInwards * Update comments. * Allow PredicateStores for RVars * Fix test to avoid realize bounds query issues. * Add comments. * clang-format * predicate* is not pure * Fix documentation bugs * Don't allow PredicateStores for reductions. * Substitute more strongly around Provide * Change these back to pure for now to satisfy some logic in ScheduleFunctions * Fix use after free of pred. * Update comments. * Refactor implementation of predication * Visit predicates * Partition loops with predicated loads/stores. * Clean up ApplySplit * Fix inappropriate predicated vectorization of VectorReduce * De-dup GuardWithIf and Predicate * These also handle scalar predicated loads/stores. * Print provide predicates * Don't allow predicated non-innermost splits. * Remove debugging code * Forgot to add new file * Add test to CMake build * Fix bug in simplification of extract_element * Fix issue with mixing uses of guarded expressions inside and outside calls. * Don't lift impure exprs. * clang-format * clang-format again	13 July 2021, 21:54:11 UTC
867b6c8	Steven Johnson	13 July 2021, 20:01:21 UTC	[hannk] Make compare_vs_tflite with --verbose 0 less noisy (#6135) Minor fixes to eliminate noise.	13 July 2021, 20:01:21 UTC
e705253	Steven Johnson	13 July 2021, 01:17:41 UTC	[hannk] Implement greedy algorithm in AllocationPlanner (#6117) * [hannk] Rework most of hannk's Tensor storage to be arena-based. * Update interpreter.cpp * Restore get_tensor * clang-format * Add missing include * Fix arena alignment issues * Remove redundant assert * Rework AllocationPlanner API a bit * [hannk] Implement greedy algorithm in AllocationPlanner This uses a basic greedy approach to doing an allocation plan for tensors in hannk. Initial testing shows exact result matches between old and new code. Drive-by changes: - Change Interpreter's `verbose` -> `verbosity` to allow more output granularity, and update callers as needed. - Fix two places in ModelRunner that should have called the function hooks rather than the functions directly. * clang-format * Add missing includes * Add missing includes * trigger buildbots * Minor fixes and comments in AllocationPlanner * Suggested fixes	13 July 2021, 01:17:41 UTC
3e9cb4f	Steven Johnson	12 July 2021, 23:22:05 UTC	Fix wasm regression at ToT LLVM (#6132) llvm.wasm.promote.low was removed. Calling fpext directly is the preferred approach now.	12 July 2021, 23:22:05 UTC
a2c47a9	aankit-ca	08 July 2021, 19:22:22 UTC	[Hexagon] Use LLVM masked stores. (#6129) * [Hexagon] Use LLVM masked stores. Letting CodeGen_LLVM handle predicated stores for Hexagon allows us to generate HVX predicated stores instead of scalar predicated stores. * Corrections to run hannk on hexagon-sim Co-authored-by: Ankit Aggarwal <aankit@quicinc.com>	08 July 2021, 19:22:22 UTC
f48a8da	Zalman Stern	07 July 2021, 23:33:53 UTC	Adding padding byte size to outermost header byte count in MATLAB5 file format. (#6128) Adding padding byte size to outermost header byte count when writing MATLAB5 file format matrix. This ensures SciPy will successfully read files written by this routine.	07 July 2021, 23:33:53 UTC
27a2348	Zalman Stern	07 July 2021, 23:31:06 UTC	Track argument change to LLVM's CreateMaskedLoad. (#6130)	07 July 2021, 23:31:06 UTC
a914574	aankit-ca	02 July 2021, 04:56:36 UTC	[Hexagon] Makefile changes for hannk on Hexagon (#6066) * [Hexagon] Makefile changes for hannk on Hexagon Initial commit to get hannk app. Works on device. Qurt crash on sim. * Add missing file stubs.c * Run clang-format * correction * clang-format * clang-format * address comments * Run all tests * sim-constants in separate file * add file * Changes * changes Co-authored-by: Ankit Aggarwal <aankit@quicinc.com> Co-authored-by: Steven Johnson <srj@google.com>	02 July 2021, 04:56:36 UTC
240f6a3	Alexander Root	01 July 2021, 14:55:28 UTC	Use bound_correlated_differences in find_constant_bounds (#6059) * move PartiallyCancelDifferences to outside of SimplifyCorrelatedDifferences * add a rule to address #6044 correlation * use bound_correlated_differences in find_constant_bounds	01 July 2021, 14:55:28 UTC
f7aa53b	Steven Johnson	30 June 2021, 19:24:19 UTC	Remove deprecated realize() variants from Func and Pipeline (#6122) These were deprecated in Halide 12. Let's remove them for Halide 13.	30 June 2021, 19:24:19 UTC
d1d7359	Andrew Adams	29 June 2021, 23:15:12 UTC	Relax overzealous pruning rule (#6115) We don't allow schedules that fuse to the extent that we can no longer vectorize. This was implemented incorrectly though. The check assumed that something was going to be compute_at inside the innermost loop, and neglected the possibility that we were about to tile that loop.	29 June 2021, 23:15:12 UTC
84b78da	Svenn-Arne Dragly	29 June 2021, 22:27:00 UTC	Fix potential undefined behavior in `set_flag` (#6118) Previously, the Clang UndefinedBehaviorSanitizer (UBSan) complained about potential undefined behavior in `halide_buffer_t::set_flag` because the enum `halide_buffer_flags` is interpreted as an int32_t and implicitly converted to a uint64_t: ``` runtime error: implicit conversion from type 'int' of value -2 (32-bit, signed) to type 'unsigned long' changed the value to 18446744073709551614 (64-bit, unsigned) ``` On most compilers and hardware, this causes no issues, since the conversion and implementation of `set_flag` together produce the expected behavior still. However, it is better to be on the safe side and make the explicit conversion to a uint64_t before doing the bitwise negation. This change makes sure the conversion from int32_t is made before the bitwise negation, which fixes the potential undefined behavior and keeps UBSan from complaining.	29 June 2021, 22:27:00 UTC
1b7f369	Steven Johnson	29 June 2021, 16:30:51 UTC	[hannk] Rework most of hannk's Tensor storage to be arena-based. (#6104) * [hannk] Rework most of hannk's Tensor storage to be arena-based.	29 June 2021, 16:30:51 UTC
408a277	Steve Suzuki	28 June 2021, 17:57:29 UTC	Float16 support in CodeGen_ARM (#6102) * Add definition of Target::ARMFp16 Add the definition of the feature for ARMv8.2-a half-precision floating point data processing * Added test to generate 'float16' neon assembly; * Add check for data type in float16 NEON test The test simd_op_check doesn't check the suffix of operand which indicates the data type in case of AArch64 NEON instruction. e.g. FADD V0.4S, V0.4S, V0.4S In order to distinguish instruction of fp16 from fp32, the suffix such as ".4S" in the above needs to be checked. * Generate float16 Arm aarch64 LLVM-IR Armv8-a extension of Half-precision floating point data processing is supported by CodeGen_ARM. The target needs to be set as 64-bit with "arm_fp16" feature. 32-bit is not supported in this commit. Upgrading fp16 to fp32 with emulated conversion is replaced with either fp16 native instruction or fp32 operation with native type conversion of fp16-fp32 * Fix format and comments for arm_fp16 feature Co-authored-by: Liam O'Neil <liam.oneil@arm.com>	28 June 2021, 17:57:29 UTC
bfd9cea	Kai Wolf	28 June 2021, 17:23:32 UTC	Update LoopNest.cpp (#6086) Remove obsolete assert for output accessing other outputs	28 June 2021, 17:23:32 UTC
2816567	Alex Reinking	26 June 2021, 00:39:42 UTC	Enable ubuntu packaging (#6113) * Revert "Disable Ubuntu Packaging Action (Issue #6111) (#6112)" This reverts commit 3f3dd702 * Explicitly update to avoid out of date package lists. Fixes #6111	26 June 2021, 00:39:42 UTC
0da1354	Steven Johnson	25 June 2021, 23:24:33 UTC	Avoid pathological cases in halide_benchmark() (#6110) In the variant that tries to compute a good samples/iters value based on min_time, there's a pathological case if the environment's timer is relatively coarse, and the op being profiled is relatively fast; in this case, you can end up with timings that are very close to zero (or literally zero), and our attempt to calculate the number of iterations can explode into the billions, making the benchmark appear to hang (as it may take an absurd length of time to run). To fix this, add a maximum value for iters_per_sample, and smarten the calculation for when the measured time is tiny.	25 June 2021, 23:24:33 UTC
3f3dd70	Steven Johnson	25 June 2021, 21:38:28 UTC	Disable Ubuntu Packaging Action (Issue #6111) (#6112) * Disable Ubuntu Packaging Action (Issue #6111)	25 June 2021, 21:38:28 UTC
7791e84	Dillon Sharlet	24 June 2021, 18:23:48 UTC	Remove floats from extern_producer (#6109) * Don't rely on floats/trig unnecessarily * Use different period	24 June 2021, 18:23:48 UTC
a987222	Dillon Sharlet	23 June 2021, 17:34:37 UTC	Remove likelies and promises before trying to check for monotonicity. (#6105)	23 June 2021, 17:34:37 UTC
2da7ca5	Alexander Root	23 June 2021, 06:12:06 UTC	Call simplify and remove_likelies for find_constant_bounds (#6099)	23 June 2021, 06:12:06 UTC
f285f08	Steven Johnson	22 June 2021, 22:09:33 UTC	[hannk] Minor cleanups (#6103) * [hannk] Minor cleanups * Restore get_tensor	22 June 2021, 22:09:33 UTC
93292a2	Steven Johnson	22 June 2021, 20:51:20 UTC	[hannk] Add --keep_going flag to ModelRunner (#6101) This allows you to run a compare operation against a bunch of graphs without exiting at the first one that is out-of-spec for comparison. (Useful when you want to verify that no new differences are introduced by a change.)	22 June 2021, 20:51:20 UTC
d82fec4	Steven Johnson	22 June 2021, 16:16:37 UTC	[hannk] Fix various build glitches for Bazel/Blaze (#6098) - Make small_vector.h standalone-compilable - Move Tensor::replace_all_consumers_with() to a local function near PadForOps to dodge a circular include dep between Tensor and Model	22 June 2021, 16:16:37 UTC
b94a526	Steven Johnson	21 June 2021, 23:11:55 UTC	[hannk] Replace Tensor::set_external_host with set_external_buffer (#6100) * Replace Tensor::set_external_host with set_external_buffer * Also remove stale comment	21 June 2021, 23:11:55 UTC
45f31f7	Dillon Sharlet	19 June 2021, 00:49:52 UTC	Fix is_monotonic issue (#6081) (#6083) * Fix #6081 * Slightly less bizarre implementation of select visitor. Co-authored-by: Steven Johnson <srj@google.com>	19 June 2021, 00:49:52 UTC
5aeb8db	Steven Johnson	17 June 2021, 23:04:26 UTC	[hannk] Don't mark Tensors as input or output (#6094) * Refactor transforms.cpp, no functional change * Use Op::is_input(), Op::is_output * Update configure_cmake.sh	17 June 2021, 23:04:26 UTC
d81f5c3	Volodymyr Kysenko	17 June 2021, 16:10:33 UTC	Provide bounds of rvars for all functions in the fused group (#6078) * Provide bounds of rvars for all functions in the fused group * Just use constant * Comments + rename variable	17 June 2021, 16:10:33 UTC
27ae113	Steven Johnson	17 June 2021, 00:57:38 UTC	[hannk] More Hygiene (#6093) * [hannk] More Hygiene - TensorStorage takes a more sensible set of args for ctor - Tensors don't need to be movable or copyable - Since we are now using C++17, we can use std::make_unique instead of make_op * Restore make_op * clang-format * Remove unnecessary TensorStorage methods	17 June 2021, 00:57:38 UTC
66ff71f	Steven Johnson	16 June 2021, 23:33:35 UTC	[hannk] Cleanup: move SmallVector, Tensor to their own source files (#6091) * Move SmallVector, Tensor to their own files * cleanup	16 June 2021, 23:33:35 UTC
a590c17	Steven Johnson	16 June 2021, 23:30:47 UTC	[hannk] Remove unused Op::clone() methods (#6092) We don't call these anymore, so remove them and the related TensorMap code.	16 June 2021, 23:30:47 UTC
