swh:1:snp:2c68c8bd649bf1bd2cf3bf7bd4f98d247b82b5dc
Name Target Message Date
HEAD a9ea9b5 Fix for top-of-tree LLVM (#7194) 02 December 2022, 00:17:48 UTC
refs/heads/Halide_unsharp 61c1b40 Merge pull request #3458 from white-pony/master Allocate hexagon runtime arguments buffers on the heap if there are too many arguments 04 December 2018, 00:00:00 UTC
refs/heads/abadams/align_strided_const_loads ed529e0 Align the base when doing strided loads from constant addresses When we codegen something like f[ramp(x + 1, 2, 16)], where f is an internal allocation, we subtract the 1, do the dense load f[ramp(x, 1, 32)] and then take the odd lanes of the result. The reason for this is that it's likely that there's an f[ramp(x, 2, 16)] nearby, and aligning down the x+1 to x means we can share the dense loads and just deinterleave. This PR does the same when there's no x, just an odd constant. This means that cases like f[ramp(64, 2, 16)] + f[ramp(65, 2, 16)] now generate much better assembly. In one case I have it speeds up an entire pipeline by 8%, because aligning the loads in this way causes them to all be promoted off the stack into registers. 29 November 2020, 22:07:28 UTC
refs/heads/abadams/alloca 3fa94ab Fix comment location 07 October 2021, 23:31:27 UTC
refs/heads/abadams/atomic_parallel_compiled_in 407d308 Compile leaf parallel loops using an internal atomic counter 06 November 2020, 20:03:09 UTC
refs/heads/abadams/averaging_tree bc10623 Merge branch 'abadams/averaging_tree' of https://github.com/halide/Halide into abadams/averaging_tree 26 April 2022, 17:38:39 UTC
refs/heads/abadams/avoid_name_mangling_in_cross_module_dependencies 13f388d Merge remote-tracking branch 'origin/master' into abadams/avoid_name_mangling_in_cross_module_dependencies 25 August 2020, 21:15:53 UTC
refs/heads/abadams/better_absd 86dfde4 typo 06 January 2022, 19:03:20 UTC
refs/heads/abadams/better_codegen_for_non_const_ramps 6721d40 Better codegen for ramps with non-const stride 20 November 2020, 22:20:41 UTC
refs/heads/abadams/bgu_cholesky 1707c0a Address review comments 01 February 2021, 18:54:47 UTC
refs/heads/abadams/braces_around_statements ecad269 Use switch statement instead of if sequence 05 October 2020, 23:56:18 UTC
refs/heads/abadams/check_reorder_dups 8cee0da Check for duplicate vars in calls to reorder/reorder_storage 23 August 2020, 21:39:07 UTC
refs/heads/abadams/cond_wait_spin 2bee115 Merge branch 'master' into abadams/cond_wait_spin 01 February 2021, 17:59:31 UTC
refs/heads/abadams/cse_in_unroll_split_tuples d1c71d0 Merge branch 'master' into abadams/cse_in_unroll_split_tuples 15 December 2021, 00:59:07 UTC
refs/heads/abadams/custom_cuda_context 0b14ec0 Comment clarifications 15 October 2021, 20:59:48 UTC
refs/heads/abadams/custom_cuda_context_2 d3df50f Clean up comments 25 October 2021, 20:41:50 UTC
refs/heads/abadams/custom_cuda_context_3 d0cdc15 Improve comments 27 October 2021, 00:51:39 UTC
refs/heads/abadams/d3d12abi 75b4f0d Rename d3d12 modules to windows_d3d12 to simplify build Also clobber invalid module flags from generic modules 14 August 2020, 17:27:21 UTC
refs/heads/abadams/depthwise_separable_conv 64fcd56 Rename some variables 25 August 2020, 19:20:51 UTC
refs/heads/abadams/diagnose_boundary_condition_failure 6b32fa2 Merge branch 'abadams/1v3_linear_comparison_cancellations' into abadams/diagnose_boundary_condition_failure 24 June 2020, 01:42:10 UTC
refs/heads/abadams/divide_using_pavgw a12b3cb Add comment elaborating on why this is a good idea 15 October 2021, 20:00:33 UTC
refs/heads/abadams/dont_reinterpret_concat 94d7f01 Don't reinterpret cast when codegenning vector concat It confuses the HVX LLVM backend, and shouldn't be necessary anyway. 02 July 2021, 17:23:21 UTC
refs/heads/abadams/early_out ec551ee Appease clang-tidy 13 June 2022, 18:38:59 UTC
refs/heads/abadams/extract_concat_bits 0457109 Fix concat_bits call 13 August 2022, 22:15:34 UTC
refs/heads/abadams/fast_integer_divide_round_to_zero f215365 Pacify clang tidy 30 November 2021, 22:02:00 UTC
refs/heads/abadams/faster_runtime_integer_division 2806116 Cleaner initialization of tables 23 November 2021, 18:44:57 UTC
refs/heads/abadams/faster_unroll 5012aba Fix computational complexity of unrolling large muxes 03 February 2021, 20:49:10 UTC
refs/heads/abadams/fix-arm-seg2 4f20718 Merge remote-tracking branch 'origin/master' into abadams/fix-arm-seg2 05 March 2021, 23:29:40 UTC
refs/heads/abadams/fix_5323 be50f8a Add --help flag to rungenmain, fixing #5323 26 October 2021, 19:47:18 UTC
refs/heads/abadams/fix_5329 72224e1 Add explicit cast to remove ambiguous operator== (Fixes #5329) 05 April 2021, 17:06:25 UTC
refs/heads/abadams/fix_5889 15abc45 Use guarded versions of vars if they exist in bounds inference 08 April 2021, 19:42:48 UTC
refs/heads/abadams/fix_autoschedule_feature_transposition 0e361d4 Fix transposed variable names 29 July 2020, 18:19:45 UTC
refs/heads/abadams/fix_cuda_mat_mul_assert f7d1a8f Merge branch 'master' into abadams/fix_cuda_mat_mul_assert 19 June 2020, 02:09:18 UTC
refs/heads/abadams/fix_deinterleave_bug 987f531 Remove buggy deinterleave misfeature 24 March 2021, 00:08:17 UTC
refs/heads/abadams/fix_deinterleave_for_reinterpret 1772c1f Minimal approach to making Deinterleave correct for Reinterpret 05 August 2022, 19:30:20 UTC
refs/heads/abadams/fix_div_round_to_zero 108dcea Add missing print 11 September 2022, 21:32:42 UTC
refs/heads/abadams/fix_fft_compile_time_regression 2a8ced8 Merge branch 'master' into abadams/fix_fft_compile_time_regression 01 December 2020, 18:46:33 UTC
refs/heads/abadams/fix_generate_output_snippets d638d81 Rename LINES to INTERESTING_LINES Some terminals treat LINES as a special var, breaking this script 23 September 2020, 19:34:55 UTC
refs/heads/abadams/fix_lgtm_warnings db22a23 Fix a few warnings from lgtm.com 21 February 2021, 03:34:46 UTC
refs/heads/abadams/fix_links_to_master 166a748 Fix some dead links to the 'master' branch 20 October 2022, 16:56:02 UTC
refs/heads/abadams/fix_potential_gpu_deadlock e3606cc Fix GPU barrier deadlocks Partition loops shouldn't mess with serial loops containing thread barriers, potentially causing warp divergence and deadlock (seen in some obscure lens blur schedules). Also we were generating too many thread barriers in a branch where the base mutator class was accidentally always mutating something, so there's a change to FuseGPUThreadLoops to make it more bug-resistant. Without these additional barriers I have been unable to come up with a case where a barrier ends up somewhere that would deadlock, so no test. 13 August 2020, 17:37:30 UTC
refs/heads/abadams/fix_realize_condition_depends_on_tuple 4a3df05 Fix bug when realize condition depends on tuple call If the realization is tuple-valued, and the condition on the realization uses a tuple call (index != 0), then the condition wasn't getting resolved during the split_tuples pass. The cause was a missing mutate call. 03 August 2022, 22:06:50 UTC
refs/heads/abadams/fix_round 5c063bb Merge branch 'main' into abadams/fix_round 26 September 2022, 18:49:16 UTC
refs/heads/abadams/fix_stencil_chain_gpu_schedule b8ad19f Schedule last stage of stencil chain on GPU too 11 August 2020, 19:12:57 UTC
refs/heads/abadams/fix_track_bounds_intervals 5091725 Rename inner version of bounds_of_expr_in_scope It's not in the explicit namespace that it's requested in (Halide::Internal), so turning on that debugging code results in compile failures. I just gave it a different name to disambiguate. 27 August 2021, 17:44:40 UTC
refs/heads/abadams/fix_tutorial_2 52ff477 Remove incorrect not-multiple-of-16 claim 20 January 2022, 16:41:42 UTC
refs/heads/abadams/fully_fused_depthwise_separable_conv 4170427 Remove dead split 02 September 2020, 22:32:24 UTC
refs/heads/abadams/gaussian_blur_app 8a92c26 Use a vectorized sum scan for the pyramid version too 08 September 2021, 00:40:04 UTC
refs/heads/abadams/gpu_autoscheduler_parallel_random_probes f8057f8 Add ability to do parallel random probes in-process 18 August 2020, 23:07:25 UTC
refs/heads/abadams/interleave_nested_vector d1deb58 Don't deinterleave all the way down to scalars 13 February 2021, 23:06:27 UTC
refs/heads/abadams/ir_match_by_ref 7a586aa Remove assert that was blowing up simplifier stack frames 03 February 2021, 03:52:24 UTC
refs/heads/abadams/lerp_plus_cast c54f4a4 Don't produce out-of-range lerp values 10 December 2021, 13:12:56 UTC
refs/heads/abadams/lower_halving_sub 429ab73 Add explanation of signed case 29 June 2022, 19:24:09 UTC
refs/heads/abadams/lower_rounding_shift_right ba47819 Non-widening lowering of rounding shifts This version lowers it without needing to widen, which is a large win on x86 for 16 and 32-bit types (3.8x faster and 2.8x faster respectively). It's a very slight slowdown for 8-bit because x86 doesn't have 8-bit shift instructions. Also drive-by typo fix. 03 May 2021, 23:51:26 UTC
refs/heads/abadams/mac-arm-fixes 92355ea Revert unintended change in precision 04 March 2021, 00:30:10 UTC
refs/heads/abadams/mixed_sign_mul_shift_right 1d07ebd Add comment 08 February 2022, 21:55:58 UTC
refs/heads/abadams/mixed_width_mul_shift_right 36b990d Merge branch 'master' into abadams/mixed_width_mul_shift_right 03 January 2022, 20:47:07 UTC
refs/heads/abadams/multiple_scatter 5b06a14 Address review comments 31 December 2020, 00:49:16 UTC
refs/heads/abadams/mux_intrinsic 913887f Add comment about out of range mux index 05 February 2021, 19:35:28 UTC
refs/heads/abadams/nested_vectorization_compile_time_regression_fix facb69d Fix for unbounded lanes 12 October 2020, 20:16:43 UTC
refs/heads/abadams/nested_vectorization_tweaks d7cf9bc Merge branch 'master' into abadams/nested_vectorization_tweaks 09 October 2020, 16:25:38 UTC
refs/heads/abadams/precompute_shared_mem_size 9903d2b Add comment explaining why we don't do dynamic tracking when no upper bound too 27 August 2020, 02:14:59 UTC
refs/heads/abadams/psabdw 38a77cb Merge remote-tracking branch 'origin/main' into abadams/psabdw 22 July 2022, 15:59:38 UTC
refs/heads/abadams/random_pipelines 1833a0b Make training binary robust to bad pipeline ids 16 October 2022, 22:05:51 UTC
refs/heads/abadams/reenable_unscheduled_stage_warning d82a456 Add Stage::unscheduled() 17 February 2022, 21:44:35 UTC
refs/heads/abadams/reinterpret_vector 7c70051 clang-format 19 December 2021, 19:40:49 UTC
refs/heads/abadams/remove_bad_pruning 21b3c85 Relax overzealous pruning rule We don't allow schedules that fuse to the extent that we can no longer vectorize. This was implemented incorrectly though. The check assumed that something was going to be compute_at inside the innermost loop, and neglected the possibility that we were about to tile that loop. 28 June 2021, 18:51:40 UTC
refs/heads/abadams/remove_readnone_on_functions 0181dd9 Revert formatting changes 07 November 2022, 22:03:38 UTC
refs/heads/abadams/reschedule_bgu 9669817 Reschedule BGU to fix performance regression BGU on CUDA had regressed from its stated performance due to the atomic floating point adds being compiled to CAS loops due to complex indexing expressions diverging on the LHS and RHS of the +=. Inlining less stuff into the += operations makes it succeed again, and the schedule was improved with a few other tweaks. Longer-term we need a first-class way to represent += so that we're not sensitive to this sort of divergence. 16 August 2020, 20:54:08 UTC
refs/heads/abadams/rounding_shift_right_use_average 357a12a Address review comments 13 December 2021, 16:37:12 UTC
refs/heads/abadams/rungenmain_error 43f94b3 Add an error message if you forget to compile RunGenMain with a registration file 17 July 2020, 20:38:21 UTC
refs/heads/abadams/sampling_profiler_overhead_v2 588de72 One line per member 23 November 2021, 21:15:23 UTC
refs/heads/abadams/simplify_correlated_pyramid 718989c Slightly more general 12 March 2021, 21:47:41 UTC
refs/heads/abadams/siotas_20 325daac Misc fixes 18 August 2021, 17:49:39 UTC
refs/heads/abadams/sioutas_20 44817ce Merge pull request #5295 from halide/abadams/fix_generate_output_snippets Rename LINES to INTERESTING_LINES 23 September 2020, 20:15:35 UTC
refs/heads/abadams/slide_over_split_loop c413e32 Merge branch 'dsharletg/sliding-window' into abadams/slide_over_split_loop 23 February 2021, 22:26:49 UTC
refs/heads/abadams/sorting_network_working_branch 9358860 codegen tweaks 08 January 2021, 01:18:19 UTC
refs/heads/abadams/switch_stmt d01bf4e Merge branch 'master' into abadams/switch_stmt 21 January 2021, 22:12:36 UTC
refs/heads/abadams/target_specific_lerp 52c13b5 Target is a struct 19 November 2021, 18:53:35 UTC
refs/heads/abadams/undo_pointless_widening 02492ca Push casts inside integer narrowing 14 February 2022, 17:10:31 UTC
refs/heads/abadams/unordered_blocks 1d9f85b Loops in between a store_at and a compute_at are ordered 04 August 2020, 23:41:45 UTC
refs/heads/abadams/unsigned_demosaic 26f6457 Merge remote-tracking branch 'origin/master' into abadams/unsigned_demosaic 11 October 2021, 21:03:06 UTC
refs/heads/abadams/use_arm_for_runtime_triple 6dd63ac How about wasm? 22 April 2021, 22:33:33 UTC
refs/heads/abadams/vector_reduce_hexagon_predicate 37a0d77 Use a VectorReduce not to determine if any lanes are true in Hexagon backend 06 May 2021, 21:16:35 UTC
refs/heads/abadams/vst_type_fix a591fbc Change 64-bit only 12 April 2022, 23:38:46 UTC
refs/heads/abadams/widening_let_bug f94dfce Just redo the comments 11 November 2021, 00:19:55 UTC
refs/heads/abadams/x86_avg 99d3795 Delete more dead code 08 October 2021, 22:06:10 UTC
refs/heads/adadams/profile_allocator 2bf474d Remove unnecessary asserts 25 February 2021, 17:57:01 UTC
refs/heads/add_image_checks_after_bounds_inference_plus_new_rules 2cce30e Delete rules that cause cycles Also move simplify_correlated_differences back to where it was, and add a handful of other rules that were in the branch. 03 February 2020, 22:24:15 UTC
refs/heads/add_outermost_to_extern 31715f8 Add outermost dim to the dim list when defining extern 27 January 2017, 22:35:07 UTC
refs/heads/add_vectorization_to_search_space b497cf1 Enable tests 18 December 2018, 19:03:41 UTC
refs/heads/align_loads_comment_fix c62bcf6 Wording improvement. 12 August 2021, 18:06:55 UTC
refs/heads/alina-strided-store 9f9a64c Merge remote-tracking branch 'origin/master' into alina-strided-store 07 December 2017, 19:16:54 UTC
refs/heads/another_buffer_copy_fix 7701abe Fix cases where halide_buffer_copy could copy to/from a host pointer that was NULL where the case was valid by compying from the device allocation. Add tests for these cases. Change name of do_multidimensional_copy in opencl and cuda runtimes to be unique to each runtime as the opencl runtime was calling the cuda do_multidimensional_copy despite both being in anonymous namespaces inside their respective files. Weak linking and C++ namespaces and our unusual runtime linking and probably at least one bug somewhere caused this to go badly. Required trying to use both cuda and opencl at the same time. 27 August 2018, 09:04:43 UTC
refs/heads/ataei-block_asserts-codegen 46f432a Remove commented experimental code 25 January 2019, 00:20:13 UTC
refs/heads/ataei-debug_info 07821c6 Print llvm -time-passes statstics when JIT or AOT compile an LLVM module 17 January 2019, 00:27:41 UTC
refs/heads/ataei-fix-pow f5120c3 Fix cuda nan_f32 value 13 June 2019, 21:12:52 UTC
refs/heads/ataei-gen_str_param 8a4f1f4 Merge branch 'master' into ataei-gen_str_param 09 April 2019, 22:31:54 UTC
refs/heads/ataei-implicit_lhs_vars 04bd712 Merge branch 'master' into ataei-implicit_lhs_vars 05 March 2019, 23:04:43 UTC
refs/heads/ataei-onnx 8c9c8d4 Update onnx_converter 26 March 2019, 22:34:40 UTC