https://github.com/halide/Halide
Name Target Message Date
HEAD 4e0b313 Rewrite IREquality to use a more compact stack instead of deep recursion (#8198) * Rewrite IREquality to use a more compact stack instead of deep recursion Deletes a bunch of code and speeds up lowering time of local laplacian with 20 pyramid levels by ~2.5% * clang-tidy * Fold in the version of equal in IRMatch.h/cpp * Add missing switch breaks * Add missing comments * Elaborate on why we treat NaNs as equal 18 April 2024, 19:48:59 UTC
refs/heads/Halide_unsharp 61c1b40 Merge pull request #3458 from white-pony/master Allocate hexagon runtime arguments buffers on the heap if there are too many arguments 04 December 2018, 00:00:00 UTC
refs/heads/abadams/aggressive_is_single_point b15a648 clang-tidy 18 April 2024, 22:39:07 UTC
refs/heads/abadams/align_strided_const_loads ed529e0 Align the base when doing strided loads from constant addresses When we codegen something like f[ramp(x + 1, 2, 16)], where f is an internal allocation, we subtract the 1, do the dense load f[ramp(x, 1, 32)] and then take the odd lanes of the result. The reason for this is that it's likely that there's an f[ramp(x, 2, 16)] nearby, and aligning down the x+1 to x means we can share the dense loads and just deinterleave. This PR does the same when there's no x, just an odd constant. This means that cases like f[ramp(64, 2, 16)] + f[ramp(65, 2, 16)] now generate much better assembly. In one case I have it speeds up an entire pipeline by 8%, because aligning the loads in this way causes them to all be promoted off the stack into registers. 29 November 2020, 22:07:28 UTC
refs/heads/abadams/alloca 3fa94ab Fix comment location 07 October 2021, 23:31:27 UTC
refs/heads/abadams/atomic_parallel_compiled_in 407d308 Compile leaf parallel loops using an internal atomic counter 06 November 2020, 20:03:09 UTC
refs/heads/abadams/atomic_vector_non_recursive 22c2530 Remove dead Vars 13 February 2023, 19:27:53 UTC
refs/heads/abadams/averaging_tree bc10623 Merge branch 'abadams/averaging_tree' of https://github.com/halide/Halide into abadams/averaging_tree 26 April 2022, 17:38:39 UTC
refs/heads/abadams/avoid_name_mangling_in_cross_module_dependencies 13f388d Merge remote-tracking branch 'origin/master' into abadams/avoid_name_mangling_in_cross_module_dependencies 25 August 2020, 21:15:53 UTC
refs/heads/abadams/better_absd 86dfde4 typo 06 January 2022, 19:03:20 UTC
refs/heads/abadams/better_codegen_for_non_const_ramps 6721d40 Better codegen for ramps with non-const stride 20 November 2020, 22:20:41 UTC
refs/heads/abadams/bgu_cholesky 1707c0a Address review comments 01 February 2021, 18:54:47 UTC
refs/heads/abadams/braces_around_statements ecad269 Use switch statement instead of if sequence 05 October 2020, 23:56:18 UTC
refs/heads/abadams/cache_tighten_producer_consumer_nodes e848ee8 Merge remote-tracking branch 'origin/main' into abadams/cache_tighten_producer_consumer_nodes 21 February 2024, 18:54:24 UTC
refs/heads/abadams/check_reorder_dups 8cee0da Check for duplicate vars in calls to reorder/reorder_storage 23 August 2020, 21:39:07 UTC
refs/heads/abadams/clarify_broadcast_shuffle d13bfa8 Revert accidental change 18 March 2024, 16:00:29 UTC
refs/heads/abadams/compositing_app 8b5ca06 Revert inclusion of cmath 22 June 2023, 22:05:22 UTC
refs/heads/abadams/cond_wait_spin 2bee115 Merge branch 'master' into abadams/cond_wait_spin 01 February 2021, 17:59:31 UTC
refs/heads/abadams/cse_in_unroll_split_tuples d1c71d0 Merge branch 'master' into abadams/cse_in_unroll_split_tuples 15 December 2021, 00:59:07 UTC
refs/heads/abadams/custom_cuda_context 0b14ec0 Comment clarifications 15 October 2021, 20:59:48 UTC
refs/heads/abadams/custom_cuda_context_2 d3df50f Clean up comments 25 October 2021, 20:41:50 UTC
refs/heads/abadams/custom_cuda_context_3 d0cdc15 Improve comments 27 October 2021, 00:51:39 UTC
refs/heads/abadams/d3d12abi 75b4f0d Rename d3d12 modules to windows_d3d12 to simplify build Also clobber invalid module flags from generic modules 14 August 2020, 17:27:21 UTC
refs/heads/abadams/deflake_mullapudi_reorder cc6e06d Increase test threshold for mullapudi histogram test It uses fine-grained parallelism, which has a very noisy runtime. 03 April 2023, 23:43:52 UTC
refs/heads/abadams/delete_prepare_for_early_exit 5a9d2ee Merge remote-tracking branch 'origin/main' into abadams/delete_prepare_for_early_exit 11 November 2023, 17:14:52 UTC
refs/heads/abadams/depthwise_separable_conv 64fcd56 Rename some variables 25 August 2020, 19:20:51 UTC
refs/heads/abadams/diagnose_boundary_condition_failure 6b32fa2 Merge branch 'abadams/1v3_linear_comparison_cancellations' into abadams/diagnose_boundary_condition_failure 24 June 2020, 01:42:10 UTC
refs/heads/abadams/disable_onnx_app_on_mac f3b548f Skip onnx app on mac 01 September 2023, 20:43:19 UTC
refs/heads/abadams/divide_using_pavgw a12b3cb Add comment elaborating on why this is a good idea 15 October 2021, 20:00:33 UTC
refs/heads/abadams/dont_link_to_cudart 6004e5f Don't link to cudart or opencl library. These are loaded dynamically when required. 19 July 2023, 16:47:18 UTC
refs/heads/abadams/dont_reinterpret_concat 94d7f01 Don't reinterpret cast when codegenning vector concat It confuses the HVX LLVM backend, and shouldn't be necessary anyway. 02 July 2021, 17:23:21 UTC
refs/heads/abadams/early_out ec551ee Appease clang-tidy 13 June 2022, 18:38:59 UTC
refs/heads/abadams/enable_f16c f7776c8 Merge remote-tracking branch 'origin/main' into abadams/enable_f16c 06 September 2023, 17:06:20 UTC
refs/heads/abadams/extract_concat_bits 0457109 Fix concat_bits call 13 August 2022, 22:15:34 UTC
refs/heads/abadams/fast_integer_divide_round_to_zero f215365 Pacify clang tidy 30 November 2021, 22:02:00 UTC
refs/heads/abadams/faster_runtime_integer_division 2806116 Cleaner initialization of tables 23 November 2021, 18:44:57 UTC
refs/heads/abadams/faster_substitute_facts 07672fe Merge remote-tracking branch 'origin/main' into abadams/faster_substitute_facts 18 April 2024, 19:49:36 UTC
refs/heads/abadams/faster_unroll 5012aba Fix computational complexity of unrolling large muxes 03 February 2021, 20:49:10 UTC
refs/heads/abadams/fix-arm-seg2 4f20718 Merge remote-tracking branch 'origin/master' into abadams/fix-arm-seg2 05 March 2021, 23:29:40 UTC
refs/heads/abadams/fix_4211 76b8cbc Merge branch 'main' into abadams/fix_4211 15 June 2023, 00:48:33 UTC
refs/heads/abadams/fix_5323 be50f8a Add --help flag to rungenmain, fixing #5323 26 October 2021, 19:47:18 UTC
refs/heads/abadams/fix_5329 72224e1 Add explicit cast to remove ambiguous operator== (Fixes #5329) 05 April 2021, 17:06:25 UTC
refs/heads/abadams/fix_5889 15abc45 Use guarded versions of vars if they exist in bounds inference 08 April 2021, 19:42:48 UTC
refs/heads/abadams/fix_6984 88509a8 fix typo 02 March 2023, 19:33:55 UTC
refs/heads/abadams/fix_7229 3c055c9 Actually perform the requested operation 12 December 2022, 22:10:41 UTC
refs/heads/abadams/fix_7260 4e9f812 Merge branch 'abadams/fix_7260' of https://github.com/halide/Halide into abadams/fix_7260 01 January 2023, 23:01:45 UTC
refs/heads/abadams/fix_7365 fe3fb36 Overflow on casts is fine for ints < 32 bits 20 February 2023, 17:33:09 UTC
refs/heads/abadams/fix_7374 646e53c Add test 24 February 2023, 23:08:18 UTC
refs/heads/abadams/fix_7504 57c484f Add missing test 12 April 2023, 23:00:56 UTC
refs/heads/abadams/fix_7514 dfe07b0 Silence clang-tidy 17 April 2023, 22:04:33 UTC
refs/heads/abadams/fix_7531 d10c6fd Fix inverted may_subtile checks 12 June 2023, 17:29:45 UTC
refs/heads/abadams/fix_7584 7baedca Fix operator/ on ModulusRemainder It wasn't reducing the remainder modulo the modulus, which confused trim_bounds_using_alignment in the simplifier. 31 May 2023, 21:40:28 UTC
refs/heads/abadams/fix_7584_v2 1006c4e Fix operator/ on ModulusRemainder It wasn't reducing the remainder modulo the modulus, which confused trim_bounds_using_alignment in the simplifier. 31 May 2023, 21:40:28 UTC
refs/heads/abadams/fix_7742 3a79f46 Remove accidental return 04 August 2023, 21:05:01 UTC
refs/heads/abadams/fix_7756 0973abd Add success print 26 September 2023, 20:47:16 UTC
refs/heads/abadams/fix_7761 46fb1e3 Add test 25 September 2023, 21:47:22 UTC
refs/heads/abadams/fix_7768 5db62a0 Add test 21 August 2023, 21:26:29 UTC
refs/heads/abadams/fix_7786 011d42b Don't inject undef() in the simplifier We shouldn't be using undef() in the simplifier. This replaces a load with a constant false predicate with a zero instead. I also added a guard around some dubious logic about out of bounds loads. out of bounds loads may be reachable if they have a false predicate, so I changed this simplification to only trigger if the load is unpredicated. 21 August 2023, 20:52:00 UTC
refs/heads/abadams/fix_7810 fb06e94 trigger buildbots 29 November 2023, 22:47:14 UTC
refs/heads/abadams/fix_7811 ed1a5dd Merge branch 'main' into abadams/fix_7811 28 November 2023, 15:22:03 UTC
refs/heads/abadams/fix_7815 dabc935 Merge remote-tracking branch 'origin/main' into abadams/fix_7815 01 September 2023, 04:28:35 UTC
refs/heads/abadams/fix_7867 153709b trigger buildbots 29 November 2023, 22:46:45 UTC
refs/heads/abadams/fix_7871 b2e3cc3 Merge branch 'abadams/fix_riscv_vx_vi' into abadams/fix_7871 04 October 2023, 19:22:12 UTC
refs/heads/abadams/fix_7872 47a209d Merge remote-tracking branch 'origin/main' into abadams/fix_7872 05 October 2023, 16:16:26 UTC
refs/heads/abadams/fix_7873 b6132ef Don't deduce unreachability from predicated out of bounds stores Fixes #7873 03 October 2023, 23:52:16 UTC
refs/heads/abadams/fix_7888 022bcd5 Don't try to construct illegal types 11 October 2023, 19:14:31 UTC
refs/heads/abadams/fix_7890 10687b5 Fix rfactor adding too many pure loops When you rfactor an update definition, the new update definition must use all the pure vars of the Func, even though the one you're rfactoring may not have used them all. We also want to preserve any scheduling already done to the pure vars, so we want to preserve the dims list and splits list from the original definition. The code accounted for this by checking the dims list for any missing pure vars and adding them at the end (just before Var::outermost()), but this didn't account for the fact that they may no longer exist in the dims list due to splits that didn't reuse the outer name. In these circumstances we could end up with too many pure loops. E.g. if x has been split into xo and xi, then the code was adding a loop for x even though there were already loops for xo and xi, which of course produces garbage output. This PR instead just checks which pure vars are actually used in the update definition up front, and then uses that to tell which ones should be added. Fixes #7890 09 February 2024, 19:20:56 UTC
refs/heads/abadams/fix_7891 5598c35 Merge remote-tracking branch 'origin/main' into abadams/fix_7891 18 October 2023, 17:57:49 UTC
refs/heads/abadams/fix_7892 e26ce62 Merge remote-tracking branch 'origin/main' into abadams/fix_7892 16 October 2023, 17:15:15 UTC
refs/heads/abadams/fix_7893 476e1f7 Merge remote-tracking branch 'origin/main' into abadams/fix_7893 16 October 2023, 17:15:34 UTC
refs/heads/abadams/fix_7906 08afbbc Stop interleaver from expanding the scope of letstmts In the following code: let a = b in { X } let a = c in { Y } If Stmt X successfully had stores interleaved, it was re-nesting it like so: let a = b in { X; let a = c in { Y } } This introduces a shadowed variable 'a', which is illegal at this stage of lowering. Fixes #7906 Also some drive-by fixes to earlier tests that had debugging code left in. 19 October 2023, 17:12:31 UTC
refs/heads/abadams/fix_7909 b3507f9 Merge branch 'main' into abadams/fix_7909 20 October 2023, 17:23:13 UTC
refs/heads/abadams/fix_7968 0ad79da Add missing print 05 December 2023, 18:09:11 UTC
refs/heads/abadams/fix_8038 ae04001 trigger buildbots 26 January 2024, 01:50:10 UTC
refs/heads/abadams/fix_8054 fa88d14 Fix type error in VectorizeLoops 01 February 2024, 01:19:21 UTC
refs/heads/abadams/fix_8170 da4d491 Merge branch 'main' into abadams/fix_8170 16 April 2024, 16:42:42 UTC
refs/heads/abadams/fix_8184 8155454 Don't print on parallel task entry/exit with -debug flag Fixes #8184 09 April 2024, 18:28:59 UTC
refs/heads/abadams/fix_arm_fcvtmp c7cb4c4 Add support for fcvtm/p, make scalars go through pattern matching too 12 March 2024, 19:44:58 UTC
refs/heads/abadams/fix_autoschedule_feature_transposition 0e361d4 Fix transposed variable names 29 July 2020, 18:19:45 UTC
refs/heads/abadams/fix_cse_name_collisions 83b07f1 Merge remote-tracking branch 'origin/main' into abadams/fix_cse_name_collisions 01 September 2023, 03:00:32 UTC
refs/heads/abadams/fix_cuda_mat_mul_assert f7d1a8f Merge branch 'master' into abadams/fix_cuda_mat_mul_assert 19 June 2020, 02:09:18 UTC
refs/heads/abadams/fix_deinterleave_bug 987f531 Remove buggy deinterleave misfeature 24 March 2021, 00:08:17 UTC
refs/heads/abadams/fix_deinterleave_for_reinterpret 1772c1f Minimal approach to making Deinterleave correct for Reinterpret 05 August 2022, 19:30:20 UTC
refs/heads/abadams/fix_div_round_to_zero 108dcea Add missing print 11 September 2022, 21:32:42 UTC
refs/heads/abadams/fix_fft_compile_time_regression 2a8ced8 Merge branch 'master' into abadams/fix_fft_compile_time_regression 01 December 2020, 18:46:33 UTC
refs/heads/abadams/fix_generate_output_snippets d638d81 Rename LINES to INTERESTING_LINES Some terminals treat LINES as a special var, breaking this script 23 September 2020, 19:34:55 UTC
refs/heads/abadams/fix_if_nesting_condition 84b0aee clang-format 19 November 2023, 01:13:54 UTC
refs/heads/abadams/fix_leaks_in_memoize_test 0e85be4 Fix comment 04 August 2023, 23:47:58 UTC
refs/heads/abadams/fix_lgtm_warnings db22a23 Fix a few warnings from lgtm.com 21 February 2021, 03:34:46 UTC
refs/heads/abadams/fix_links_to_master 166a748 Fix some dead links to the 'master' branch 20 October 2022, 16:56:02 UTC
refs/heads/abadams/fix_load_of_broadcast 0dc03ee Handle loads of broadcasts in FlattenNestedRamps With sufficiently perverse schedules, it's possible to end up with a load of a broadcast index (rather than a broadcast of a scalar load). This made FlattenNestedRamps divide by zero. Unfortunately this happened in a complex production pipeline, so I'm not entirely sure how to reproduce it. For that pipeline, this change fixes it and produces correct output. 06 March 2024, 19:17:59 UTC
refs/heads/abadams/fix_lossless_cast_of_sub 66c56f1 Fix some UB 01 April 2024, 20:35:01 UTC
refs/heads/abadams/fix_onnx_app 32d529a Don't test onnx app in a 32-bit build 11 July 2023, 00:57:37 UTC
refs/heads/abadams/fix_pointless_lower_condition 91d87d7 Merge remote-tracking branch 'origin/main' into abadams/fix_pointless_lower_condition 12 March 2024, 16:48:53 UTC
refs/heads/abadams/fix_potential_gpu_deadlock e3606cc Fix GPU barrier deadlocks Partition loops shouldn't mess with serial loops containing thread barriers, potentially causing warp divergence and deadlock (seen in some obscure lens blur schedules). Also we were generating too many thread barriers in a branch where the base mutator class was accidentally always mutating something, so there's a change to FuseGPUThreadLoops to make it more bug-resistant. Without these additional barriers I have been unable to come up with a case where a barrier ends up somewhere that would deadlock, so no test. 13 August 2020, 17:37:30 UTC
refs/heads/abadams/fix_realize_condition_depends_on_tuple 4a3df05 Fix bug when realize condition depends on tuple call If the realization is tuple-valued, and the condition on the realization uses a tuple call (index != 0), then the condition wasn't getting resolved during the split_tuples pass. The cause was a missing mutate call. 03 August 2022, 22:06:50 UTC
refs/heads/abadams/fix_reduce_expr_modulo_of_vector 0afb878 Fix test 12 February 2024, 22:26:36 UTC
refs/heads/abadams/fix_riscv_vx_vi 33fa8a6 Fix for llvm trunk 04 October 2023, 19:02:13 UTC
refs/heads/abadams/fix_round 5c063bb Merge branch 'main' into abadams/fix_round 26 September 2022, 18:49:16 UTC
refs/heads/abadams/fix_stencil_chain_gpu_schedule b8ad19f Schedule last stage of stencil chain on GPU too 11 August 2020, 19:12:57 UTC
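The commit message on abadams/align_strided_const_loads describes turning a stride-2 load at an odd base into a dense load at the aligned-down base followed by taking the odd lanes. The following is a minimal Python sketch of that idea only, not Halide's code generator: `ramp`, `load`, and `strided_load_via_dense` are hypothetical helper names, and plain lists stand in for vectors.

```python
def ramp(base, stride, lanes):
    """Indices base, base + stride, ..., as in the ramp(base, stride, lanes) notation."""
    return [base + i * stride for i in range(lanes)]


def load(buf, indices):
    """Gather buf at the given indices (stands in for a vector load)."""
    return [buf[i] for i in indices]


def strided_load_via_dense(buf, base, lanes):
    """Stride-2 load of `lanes` elements starting at a non-negative `base`.

    The base is aligned down to an even index, one dense load of 2 * lanes
    elements is done there, and the even or odd lanes are kept.
    """
    offset = base % 2              # 0 -> keep even lanes, 1 -> keep odd lanes
    aligned = base - offset        # e.g. 65 is aligned down to 64
    dense = load(buf, ramp(aligned, 1, 2 * lanes))
    return dense[offset::2]


f = [i * i for i in range(128)]

# f[ramp(64, 2, 16)] and f[ramp(65, 2, 16)] both come from the same dense
# load f[ramp(64, 1, 32)], so that load can be shared between them.
evens = strided_load_via_dense(f, 64, 16)
odds = strided_load_via_dense(f, 65, 16)
```

Under these assumptions both strided loads reduce to one dense load plus a deinterleave, which is the sharing the commit message refers to.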
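The abadams/fix_7584 commit message above mentions that division on ModulusRemainder was not reducing the remainder modulo the modulus. As a rough illustration of why that reduction matters, here is a simplified Python model of "x is congruent to remainder modulo modulus". The `ModRem` class and `divide_by_constant` function are hypothetical and are not Halide's implementation; the sketch assumes a modulus of at least 1 and floor division by a positive constant that divides the modulus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModRem:
    """Hypothetical model of the fact: x == remainder (mod modulus), modulus >= 1."""
    modulus: int
    remainder: int

    def contains(self, x):
        return (x - self.remainder) % self.modulus == 0


def divide_by_constant(a, d):
    """Alignment known for floor(x / d), given alignment `a` for x.

    Only the case where d is positive and divides a.modulus is handled;
    otherwise nothing is claimed (modulus 1, remainder 0).
    """
    if d <= 0 or a.modulus % d != 0:
        return ModRem(1, 0)
    modulus = a.modulus // d
    # Reduce the remainder into [0, modulus) so the result is canonical.
    remainder = (a.remainder // d) % modulus
    return ModRem(modulus, remainder)


# A remainder that is not already in canonical range: x == 13 (mod 8).
a = ModRem(8, 13)
q = divide_by_constant(a, 2)
```

Without the final `% modulus`, the result here would carry remainder 6 with modulus 4, which describes the same set of values but is not in the canonical range that later consumers of the alignment information might expect.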