swh:1:snp:70f530b74f5be73cfb71c212c9e3317ce44c1ebc
* Avoid redundant scope lookups

  This pattern has been bugging me for a long time:

  ```
  if (scope.contains(key)) {
      Foo f = scope.get(key);
  }
  ```

  This redundantly looks up the key in the scope twice. I've finally gotten around to fixing it. I've introduced a find method that either returns a const pointer to the value, if it exists, or null. It also searches any containing scopes, which are held by const pointer, so the method has to return a const pointer.

  ```
  if (const Foo *f = scope.find(key)) {
  }
  ```

  For cases where you want to get and then mutate, I added shallow_find, which doesn't search enclosing scopes, but returns a mutable pointer.

  We were also doing redundant scope lookups in ScopedBinding. We stored the key in the helper object, and then did a pop on that key in the ScopedBinding destructor. This commit changes Scope so that Scope::push returns an opaque token that you can pass to Scope::pop to have it remove that element without doing a fresh lookup. ScopedBinding now uses this. Under the hood it's just an iterator on the underlying map (map iterators are not invalidated on inserting or removing other stuff).

  The net effect is to speed up local laplacian lowering by about 5%.

  I also considered making it look more like an STL class, and having find return an iterator, but it doesn't really work. The iterator it returns might point to an entry in an enclosing scope, in which case you can't compare it to the .end() method of the scope you have. Scopes are different enough from maps that the interface really needs to be distinct.

* Pacify clang-tidy
* Fix unintentional mutation of interval in scope
* Fix accidental Scope::get
* Rewrite the skip stages lowering pass

  Skip stages was slow due to crappy computational complexity (quadratic?). I reworked it into a two-pass linear-time algorithm. The first pass remembers which pieces of IR are actually relevant to the task, and the second pass performs the task using a bounds-inference-like algorithm.

  On main resnet50 spends 519 ms in this pass. This commit reduces it to 40 ms. Local laplacian with 100 pyramid levels spends 7.4 seconds in this pass. This commit reduces it to ~3 ms.

  This commit also moves the cache store for memoized Funcs into the produce node, instead of at the top of the consume node, because it naturally places it inside a condition you inject into the produce node.

* clang-tidy fixes
* Fix skip stages interaction with compute_with
* Unify let visitors, and use fewer stack frames for them
* Fix accidental leakage of .used into .loaded
* Visit the bodies of uninteresting let chains
* Another used -> loaded
* Fix hoist_storage not handling condition correctly.

---------

Co-authored-by: Steven Johnson <srj@google.com>
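The lookup and token ideas described in the commit message above can be illustrated with a small stand-alone sketch. This is not Halide's actual Scope class: `SketchScope`, `PushToken`, `set_containing_scope`, and `lookup_or_default` are hypothetical names chosen for illustration, and the only details taken from the message are that `find` returns a const pointer or null and searches enclosing scopes, that `shallow_find` returns a mutable pointer without searching them, and that `push` returns a token wrapping a map iterator that `pop` can use without a fresh lookup.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified scope for illustration only.
template <typename T>
class SketchScope {
    using Table = std::map<std::string, std::vector<T>>;
    Table table;
    const SketchScope *containing = nullptr;  // enclosing scope, held by const pointer

public:
    // Token returned by push. Callers should treat it as opaque.
    struct PushToken {
        typename Table::iterator iter;
    };

    void set_containing_scope(const SketchScope *s) { containing = s; }

    // Push a binding and remember where it lives in the map.
    PushToken push(const std::string &name, T value) {
        auto it = table.emplace(name, std::vector<T>{}).first;
        it->second.push_back(std::move(value));
        return PushToken{it};
    }

    // Pop using the token: no second lookup by key. Assumes tokens for the
    // same name are popped in reverse order of their pushes.
    void pop(PushToken t) {
        t.iter->second.pop_back();
        if (t.iter->second.empty()) table.erase(t.iter);
    }

    // Single lookup; searches enclosing scopes, so the result must be const.
    const T *find(const std::string &name) const {
        auto it = table.find(name);
        if (it != table.end()) return &it->second.back();
        return containing ? containing->find(name) : nullptr;
    }

    // Single lookup in this scope only; the result may be mutated.
    T *shallow_find(const std::string &name) {
        auto it = table.find(name);
        return it == table.end() ? nullptr : &it->second.back();
    }
};

// The one-lookup pattern from the commit message, with a fallback value.
inline int lookup_or_default(const SketchScope<int> &scope,
                             const std::string &key, int fallback) {
    if (const int *v = scope.find(key)) {
        return *v;
    }
    return fallback;
}
```

The sketch relies on the same property the message mentions: `std::map` iterators stay valid when other entries are inserted or erased, so a token remains usable until its own entry is removed.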
- HEAD
- refs/heads/Halide_unsharp
- refs/heads/abadams/aggressive_is_single_point
- refs/heads/abadams/align_strided_const_loads
- refs/heads/abadams/alloca
- refs/heads/abadams/atomic_parallel_compiled_in
- refs/heads/abadams/atomic_vector_non_recursive
- refs/heads/abadams/averaging_tree
- refs/heads/abadams/avoid_name_mangling_in_cross_module_dependencies
- refs/heads/abadams/better_absd
- refs/heads/abadams/better_codegen_for_non_const_ramps
- refs/heads/abadams/bgu_cholesky
- refs/heads/abadams/braces_around_statements
- refs/heads/abadams/cache_tighten_producer_consumer_nodes
- refs/heads/abadams/check_reorder_dups
- refs/heads/abadams/clarify_broadcast_shuffle
- refs/heads/abadams/compositing_app
- refs/heads/abadams/cond_wait_spin
- refs/heads/abadams/cse_in_unroll_split_tuples
- refs/heads/abadams/custom_cuda_context
- refs/heads/abadams/custom_cuda_context_2
- refs/heads/abadams/custom_cuda_context_3
- refs/heads/abadams/d3d12abi
- refs/heads/abadams/deflake_mullapudi_reorder
- refs/heads/abadams/delete_prepare_for_early_exit
- refs/heads/abadams/depthwise_separable_conv
- refs/heads/abadams/diagnose_boundary_condition_failure
- refs/heads/abadams/disable_onnx_app_on_mac
- refs/heads/abadams/divide_using_pavgw
- refs/heads/abadams/dont_link_to_cudart
- refs/heads/abadams/dont_reinterpret_concat
- refs/heads/abadams/early_out
- refs/heads/abadams/enable_f16c
- refs/heads/abadams/extract_concat_bits
- refs/heads/abadams/fast_integer_divide_round_to_zero
- refs/heads/abadams/faster_runtime_integer_division
- refs/heads/abadams/faster_substitute_facts
- refs/heads/abadams/faster_unroll
- refs/heads/abadams/fix-arm-seg2
- refs/heads/abadams/fix_4211
- refs/heads/abadams/fix_5323
- refs/heads/abadams/fix_5329
- refs/heads/abadams/fix_5889
- refs/heads/abadams/fix_6984
- refs/heads/abadams/fix_7229
- refs/heads/abadams/fix_7260
- refs/heads/abadams/fix_7365
- refs/heads/abadams/fix_7374
- refs/heads/abadams/fix_7504
- refs/heads/abadams/fix_7514
- refs/heads/abadams/fix_7531
- refs/heads/abadams/fix_7584
- refs/heads/abadams/fix_7584_v2
- refs/heads/abadams/fix_7742
- refs/heads/abadams/fix_7756
- refs/heads/abadams/fix_7761
- refs/heads/abadams/fix_7768
- refs/heads/abadams/fix_7786
- refs/heads/abadams/fix_7810
- refs/heads/abadams/fix_7811
- refs/heads/abadams/fix_7815
- refs/heads/abadams/fix_7867
- refs/heads/abadams/fix_7871
- refs/heads/abadams/fix_7872
- refs/heads/abadams/fix_7873
- refs/heads/abadams/fix_7888
- refs/heads/abadams/fix_7890
- refs/heads/abadams/fix_7891
- refs/heads/abadams/fix_7892
- refs/heads/abadams/fix_7893
- refs/heads/abadams/fix_7906
- refs/heads/abadams/fix_7909
- refs/heads/abadams/fix_7968
- refs/heads/abadams/fix_8038
- refs/heads/abadams/fix_8054
- refs/heads/abadams/fix_8170
- refs/heads/abadams/fix_8184
- refs/heads/abadams/fix_arm_fcvtmp
- refs/heads/abadams/fix_autoschedule_feature_transposition
- refs/heads/abadams/fix_cse_name_collisions
- refs/heads/abadams/fix_cuda_mat_mul_assert
- refs/heads/abadams/fix_deinterleave_bug
- refs/heads/abadams/fix_deinterleave_for_reinterpret
- refs/heads/abadams/fix_div_round_to_zero
- refs/heads/abadams/fix_fft_compile_time_regression
- refs/heads/abadams/fix_generate_output_snippets
- refs/heads/abadams/fix_if_nesting_condition
- refs/heads/abadams/fix_leaks_in_memoize_test
- refs/heads/abadams/fix_lgtm_warnings
- refs/heads/abadams/fix_links_to_master
- refs/heads/abadams/fix_load_of_broadcast
- refs/heads/abadams/fix_lossless_cast_of_sub
- refs/heads/abadams/fix_onnx_app
- refs/heads/abadams/fix_pointless_lower_condition
- refs/heads/abadams/fix_potential_gpu_deadlock
- refs/heads/abadams/fix_realize_condition_depends_on_tuple
- refs/heads/abadams/fix_reduce_expr_modulo_of_vector
- refs/heads/abadams/fix_riscv_vx_vi
- refs/heads/abadams/fix_round
- refs/heads/abadams/fix_stencil_chain_gpu_schedule
- refs/heads/abadams/fix_track_bounds_intervals
- refs/heads/abadams/fix_tutorial_2
- refs/heads/abadams/fix_ub_in_lower_rounding_shift_right
- refs/heads/abadams/forward_partition_methods
- refs/heads/abadams/fully_fused_depthwise_separable_conv
- refs/heads/abadams/fuzz_sliding_window
- refs/heads/abadams/gaussian_blur_app
- refs/heads/abadams/generator_infinite_default_timeout
- refs/heads/abadams/gpu_autoscheduler_parallel_random_probes
- refs/heads/abadams/include_riscv_in_readme
- refs/heads/abadams/interleave_nested_vector
- refs/heads/abadams/ir_match_by_ref
- refs/heads/abadams/lerp_plus_cast
- refs/heads/abadams/local_laplacian_code_size
- refs/heads/abadams/lower_halving_sub
- refs/heads/abadams/lower_rounding_shift_right
- refs/heads/abadams/mac-arm-fixes
- refs/heads/abadams/make_fast_inverse_test_throughput_limited
- refs/heads/abadams/makefile_serialization_support
- refs/heads/abadams/mismatched_new_delete
- refs/heads/abadams/mixed_sign_mul_shift_right
- refs/heads/abadams/mixed_width_mul_shift_right
- refs/heads/abadams/multiple_scatter
- refs/heads/abadams/mux_intrinsic
- refs/heads/abadams/name_helpers
- refs/heads/abadams/narrow_predicates
- refs/heads/abadams/nested_vectorization_compile_time_regression_fix
- refs/heads/abadams/nested_vectorization_tweaks
- refs/heads/abadams/parallel_simd_op_check
- refs/heads/abadams/per_instance_profiling
- refs/heads/abadams/precompute_shared_mem_size
- refs/heads/abadams/prefer_no_gather
- refs/heads/abadams/print_uncaught_exception
- refs/heads/abadams/promote_fixed_point_intrinsics
- refs/heads/abadams/psabdw
- refs/heads/abadams/random_pipelines
- refs/heads/abadams/rationalize_gpu_for_loop_names
- refs/heads/abadams/reenable_unscheduled_stage_warning
- refs/heads/abadams/refactor_constant_interval
- refs/heads/abadams/reinterpret_vector
- refs/heads/abadams/remove_arch_os_for_shaders
- refs/heads/abadams/remove_bad_pruning
- refs/heads/abadams/remove_parameter_self_references
- refs/heads/abadams/remove_readnone_on_functions
- refs/heads/abadams/remove_use_of_python_config_in_onnx_makefile
- refs/heads/abadams/reschedule_bgu
- refs/heads/abadams/reschedule_bilateral_grid
- refs/heads/abadams/rewrite_atomic_pass
- refs/heads/abadams/rewrite_ir_equality
- refs/heads/abadams/rounding_shift_right_use_average
- refs/heads/abadams/rungenmain_error
- refs/heads/abadams/sampling_profiler_overhead_v2
- refs/heads/abadams/scope_improvements
- refs/heads/abadams/simpler_broadcasts
- refs/heads/abadams/simplify_correlated_pyramid
- refs/heads/abadams/siotas_20
- refs/heads/abadams/sioutas_20
- refs/heads/abadams/slide_over_split_loop
- refs/heads/abadams/sorting_network_working_branch
- refs/heads/abadams/stable_topological_order
- refs/heads/abadams/string_view
- refs/heads/abadams/strip_asserts_last
- refs/heads/abadams/switch_stmt
- refs/heads/abadams/target_specific_lerp
- refs/heads/abadams/time_lowering_passes
- refs/heads/abadams/track_failedness_through_solver_lets
- refs/heads/abadams/turn_off_slp_vectorization_for_avx512
- refs/heads/abadams/tweak_unpack_buffers
- refs/heads/abadams/undo_pointless_widening
- refs/heads/abadams/unordered_blocks
- refs/heads/abadams/unsigned_demosaic
- refs/heads/abadams/update_makefile_for_llvm_19
- refs/heads/abadams/use_arm_for_runtime_triple
- refs/heads/abadams/use_pmaddubsw_for_downsample
- refs/heads/abadams/validate_gpu_schedules
- refs/heads/abadams/vector_reduce_hexagon_predicate
- refs/heads/abadams/vector_scan
- refs/heads/abadams/vst_type_fix
- refs/heads/abadams/widening_let_bug
- refs/heads/abadams/x86_avg
- refs/heads/abadams/zen4
- refs/heads/adadams/profile_allocator
- refs/heads/add_image_checks_after_bounds_inference_plus_new_rules
- refs/heads/add_outermost_to_extern
- refs/heads/add_vectorization_to_search_space
- refs/heads/aelphy/feature_cadence_changes
- refs/heads/aelphy/float_extracts
- refs/heads/align_loads_comment_fix
- refs/heads/alina-strided-store
- refs/heads/another_buffer_copy_fix
- refs/heads/arm_sve_redux
- refs/heads/ataei-block_asserts-codegen
- refs/heads/ataei-debug_info
- refs/heads/ataei-fix-pow
- refs/heads/ataei-gen_str_param
- refs/heads/ataei-implicit_lhs_vars
- refs/heads/ataei-onnx
- refs/heads/ataei-onnx_converter_update
- refs/heads/ataei-onnx_pybind
- refs/heads/ataei-resnet50_benchmarks
- refs/heads/ataei-standalone_autoscheduler
- refs/heads/ataei_lots_of_inputs
- refs/heads/auto_sched_benchmarks
- refs/heads/auto_sched_estimates
- refs/heads/auto_sched_inline
- refs/heads/auto_sched_test_notparallel
- refs/heads/autoschedule_top_down
- refs/heads/autoschedule_with_convnet
- refs/heads/autoscheduler_scalar_imageparam_fix
- refs/heads/backports/10.x
- refs/heads/backports/11.x
- refs/heads/backports/12.x
- refs/heads/backports/13.x
- refs/heads/balance_expressions
- refs/heads/bazel
- refs/heads/benchmarks
- refs/heads/blaze
- refs/heads/bounds_buffer_lets_fix
- refs/heads/bounds_correct_vs_bounds_loaded_reduced
- refs/heads/buffer_device_api_target
- refs/heads/bug_device_free
- refs/heads/bug_inline_unbounded
- refs/heads/build/fix-xcode-2
- refs/heads/build/manylinux-fixes
- refs/heads/circ_buffer
- refs/heads/cmake-no-runtime-debug-symbols
- refs/heads/cmake/asan
- refs/heads/cmake/deps-cleanup
- refs/heads/cmake/find-modules
- refs/heads/cmake/spirv
- refs/heads/cmake_wasm_features
- refs/heads/compute_at_guard_with_if_goes_on_stack
- refs/heads/compute_with_at
- refs/heads/compute_with_check
- refs/heads/compute_with_excessive_bounds
- refs/heads/compute_with_inlined
- refs/heads/compute_with_remove_is_right_level
- refs/heads/cpack/nuget
- refs/heads/ctest/wrappers
- refs/heads/cuda-constant
- refs/heads/d3d12-allocation-cache
- refs/heads/deferred_cse_after_inlining
- refs/heads/destructor_calls_deinit
- refs/heads/dg/deserialize_unmapped_objects
- refs/heads/dg/fix_vulkan_codegen_bool_conversion
- refs/heads/dg/vulkan_conform_api
- refs/heads/dg/vulkan_region_allocator_fixes
- refs/heads/dgerstmann/fix-vulkan-memory-config-init
- refs/heads/disable_acquire_release_test_vulkan
- refs/heads/distinct_wrapper_names
- refs/heads/dkg/6863_asan_fixes
- refs/heads/dkg/vulkan
- refs/heads/dpalermo_dmabuf
- refs/heads/dpalermo_dmabuf_libion
- refs/heads/dpalermo_hexagon_remote_202003
- refs/heads/dpalermo_sdk4_2_0_2
- refs/heads/ds/buffer-get-pure
- refs/heads/ds/opt-tile-size
- refs/heads/ds/tail-none
- refs/heads/ds/while
- refs/heads/dsharletg/bitwise-intrinsics
- refs/heads/dsharletg/find-vector-reduce
- refs/heads/dsharletg/jit-optimization
- refs/heads/dsharletg/memcpy-copy_from
- refs/heads/dsharletg/pattern-headroom
- refs/heads/dsharletg/refactor-host-alignment
- refs/heads/dsharletg/runtime-size
- refs/heads/dsharletg/simplify-abs
- refs/heads/dsharletg/simplify-type-bounds
- refs/heads/dsharletg/specialize-bounds
- refs/heads/dsharletg/upsample-channels
- refs/heads/empty_prefetch
- refs/heads/emscripten_vector_fix
- refs/heads/export_all-wsmoses
- refs/heads/expr_auto_sched
- refs/heads/extern_bugs
- refs/heads/extern_host_alloc
- refs/heads/factor_parallel_codegen_hack
- refs/heads/fast_sync_tsan
- refs/heads/faster_integer_division
- refs/heads/feature/apps-external
- refs/heads/feature/cmake-presets
- refs/heads/feature/convert
- refs/heads/feature/f16_interleave
- refs/heads/feature/gather_load_q7
- refs/heads/feature/llvm-codemodel
- refs/heads/feature/load_predicated
- refs/heads/feature/luma_regression
- refs/heads/feature/maintanence
- refs/heads/feature/reinterprets
- refs/heads/feature/tcm_bump_allocator
- refs/heads/feature/xtensa_fix_interleave_q8
- refs/heads/feature/xtensa_q8_tests
- refs/heads/find_intrinsics_issue
- refs/heads/find_intrinsics_widening_lets
- refs/heads/fix-floated-pure-stage
- refs/heads/fix-race-condition
- refs/heads/fix_hexagon_alignment
- refs/heads/fix_hvx_intrinsics
- refs/heads/fix_prefetch_test
- refs/heads/fix_windows_vs15_build
- refs/heads/fixed_length_vectors
- refs/heads/fixed_point_local_laplac
- refs/heads/gemmlowp
- refs/heads/generate
- refs/heads/gha/pip
- refs/heads/gpu_canon_fix
- refs/heads/halide_ir_flatbuffer
- refs/heads/hex_dma2_async
- refs/heads/hexagon_le_runtime
- refs/heads/hexagon_priority
- refs/heads/hexagon_setpriority
- refs/heads/hexagon_strided_pred_load
- refs/heads/hexagon_sysmon_markers
- refs/heads/imaging-synthesis
- refs/heads/includes_fix
- refs/heads/ios_fast_sync_fix
- refs/heads/jia-kai-fix-runtime-cuda-init
- refs/heads/kamil-openglcompute-infinity
- refs/heads/kamil/name_pthread_workers
- refs/heads/kp_bit_shift
- refs/heads/line_buffer
- refs/heads/loop_carry_not_working
- refs/heads/lower_on_huge_stack
- refs/heads/main
- refs/heads/master
- refs/heads/memoize_with_extents
- refs/heads/metal_float16
- refs/heads/metaprogrammed_simplifier_mod
- refs/heads/mohamedadaly-vmlal
- refs/heads/more_powerful_sliding
- refs/heads/new_autoschedule_with_new_simplifier_arm_worker_branch
- refs/heads/new_autoscheduler
- refs/heads/new_simplifier_rule_testing
- refs/heads/newer_ion_ioctl
- refs/heads/no_bounds_query_when_bounds_used
- refs/heads/opengl_compute_buffer_types_fix
- refs/heads/openglcompute_reuse_shared_allocations
- refs/heads/optmize_reorder
- refs/heads/par_for_opt
- refs/heads/pdb/fix_7806
- refs/heads/pdb/hexagon_remote_cmake
- refs/heads/pdb_add_libcpp_makefile_inc
- refs/heads/pdb_eliminate_interleaves_test
- refs/heads/pdb_fix_clang_build
- refs/heads/pdb_fix_install_qc
- refs/heads/pdb_fix_loop_carry
- refs/heads/pdb_fix_simd_op_check_hvx
- refs/heads/pdb_mul_div_mod_multi_thread
- refs/heads/pdb_remove_hvx_v64
- refs/heads/perform_inline_with_order
- refs/heads/pr/2572
- refs/heads/pr/2676
- refs/heads/pr/2975
- refs/heads/pr/3017
- refs/heads/pr/3081
- refs/heads/pr/3387
- refs/heads/pr/3939
- refs/heads/pr/3960
- refs/heads/pr/4380
- refs/heads/pr/4414
- refs/heads/pr/5331
- refs/heads/pr/5438
- refs/heads/pr/5455
- refs/heads/pr/5758_2
- refs/heads/predicated_vector
- refs/heads/prefetch_specialize
- refs/heads/print_schedule
- refs/heads/profile_hardware_counters
- refs/heads/random-pipelines
- refs/heads/rdom_with_pure_vars
- refs/heads/readme-fix-gcd
- refs/heads/realization_order
- refs/heads/refactor_module
- refs/heads/register_promotion
- refs/heads/release/10.x
- refs/heads/release/11.x
- refs/heads/release/12.x
- refs/heads/release/13.x
- refs/heads/release/14.x
- refs/heads/release/15.x
- refs/heads/release/16.x
- refs/heads/release/17.x
- refs/heads/release/8.x
- refs/heads/remove_max_on_fuse_factor
- refs/heads/reorder_rvar
- refs/heads/reset_unique_counter
- refs/heads/revert-3612-ataei-speedup_compiletime
- refs/heads/revert-7009-rootjalex/distribute-w_shl
- refs/heads/revert-7601-compile_hexagon_remote
- refs/heads/riscv_update
- refs/heads/rl_simplifier_rules
- refs/heads/rootjalex/add_simpl_rules
- refs/heads/rootjalex/arm-optimize
- refs/heads/rootjalex/autoscheduler_mcts
- refs/heads/rootjalex/bounds-rewriter
- refs/heads/rootjalex/bounds_synthesis
- refs/heads/rootjalex/cbounds
- refs/heads/rootjalex/cbounds_predicated
- refs/heads/rootjalex/fix-sat-overflow
- refs/heads/rootjalex/fix_estimate_issue
- refs/heads/rootjalex/fix_failed_unrolls
- refs/heads/rootjalex/gsoc_codegen
- refs/heads/rootjalex/improve_cbounds_fixed
- refs/heads/rootjalex/improve_constant_bounds
- refs/heads/rootjalex/pitchfork-arm
- refs/heads/rootjalex/reinterpret-simplify
- refs/heads/rootjalex/rts
- refs/heads/rootjalex/super_simplify_bounds
- refs/heads/rootjalex/test_cbounds_fixed
- refs/heads/rootjalex/test_constant_bounds
- refs/heads/rootjalex/trs-codegen
- refs/heads/rootjalex/trs-codegen-cross
- refs/heads/rootjalex/trs-merge
- refs/heads/rootjalex/uint32-int32-cast
- refs/heads/rootjalex/x86-hadds
- refs/heads/rootjalex/x86-optimize
- refs/heads/rootjalex/x86-optimize-test
- refs/heads/rootjalex/x86-sat
- refs/heads/rootjalex/x86-test
- refs/heads/rule_removal_experiments
- refs/heads/schedule-output-storage
- refs/heads/separate_bounds_query_entrypoint
- refs/heads/shallow
- refs/heads/shift_amount_type_change
- refs/heads/shoaibkamil/cmake-without-arm
- refs/heads/shoaibkamil/correct_memory_fences
- refs/heads/shoaibkamil/d3d-fixes
- refs/heads/shoaibkamil/deprecate_openglcompute
- refs/heads/shoaibkamil/json
- refs/heads/shoaibkamil/llvm_clone_tag
- refs/heads/shoaibkamil/minor-vcpkg-doc-change
- refs/heads/shoaibkamil/opengl_compute_tests
- refs/heads/shoaibkamil/performance_tests_as_generators
- refs/heads/shoaibkamil/rule_removal_experiments
- refs/heads/shoaibkamil/super_simplify_with_interpreter
- refs/heads/shoaibkamil/windows-arm-fix-attributes
- refs/heads/sim_shlib_addr_print
- refs/heads/simplify-nested-broadcasts
- refs/heads/simplify-vectorreduce-shuffles2
- refs/heads/simplify_mod
- refs/heads/sioutas_2020
- refs/heads/sioutas_2020_autoscheduler
- refs/heads/slomp/gpu-codegen-profiling
- refs/heads/slomp/msvc-static-analysis
- refs/heads/solve_div
- refs/heads/solve_div_master
- refs/heads/solve_div_simplifier_test
- refs/heads/sr/python-late-binding-defaults
- refs/heads/srj-aaa
- refs/heads/srj-alloc
- refs/heads/srj-alloca
- refs/heads/srj-appmake2
- refs/heads/srj-armv83a
- refs/heads/srj-aslog
- refs/heads/srj-assert
- refs/heads/srj-assoc
- refs/heads/srj-auto-multi
- refs/heads/srj-auto-multi2
- refs/heads/srj-auto_schedule_mat_mul
- refs/heads/srj-autosched
- refs/heads/srj-b2cpphide
- refs/heads/srj-barr
- refs/heads/srj-bits
- refs/heads/srj-blacklist
- refs/heads/srj-bounds
- refs/heads/srj-bufcalltype
- refs/heads/srj-bufcallwrap
- refs/heads/srj-bufcallwrap2
- refs/heads/srj-buffer
- refs/heads/srj-bv
- refs/heads/srj-classic-autotune
- refs/heads/srj-clean
- refs/heads/srj-constcall
- refs/heads/srj-crosscompile
- refs/heads/srj-ctlz
- refs/heads/srj-cvec-patch
- refs/heads/srj-dag
- refs/heads/srj-debug-to-file
- refs/heads/srj-deir
- refs/heads/srj-f16
- refs/heads/srj-fp16
- refs/heads/srj-fsch
- refs/heads/srj-fthru
- refs/heads/srj-g2
- refs/heads/srj-g3
- refs/heads/srj-gha-test-fixes
- refs/heads/srj-hidden
- refs/heads/srj-hide2
- refs/heads/srj-hvx
- refs/heads/srj-hvx-bug
- refs/heads/srj-hvx-codegen-bug
- refs/heads/srj-hvx-nocopy
- refs/heads/srj-hvxshift
- refs/heads/srj-iib
- refs/heads/srj-initshape
- refs/heads/srj-inv
- refs/heads/srj-ir
- refs/heads/srj-irmut2
- refs/heads/srj-iwyu
- refs/heads/srj-iwyu3
- refs/heads/srj-javascript_work_in_progress
- refs/heads/srj-lensblur
- refs/heads/srj-lessinc
- refs/heads/srj-llvm-loop-opt
- refs/heads/srj-mak
- refs/heads/srj-maxthreads
- refs/heads/srj-mod
- refs/heads/srj-msan
- refs/heads/srj-msan-call
- refs/heads/srj-muldivmod
- refs/heads/srj-mut
- refs/heads/srj-outputs-2
- refs/heads/srj-parse
- refs/heads/srj-pch
- refs/heads/srj-printfunc
- refs/heads/srj-pygp
- refs/heads/srj-revertbits
- refs/heads/srj-schedule-storage
- refs/heads/srj-shl-shr-2
- refs/heads/srj-sio
- refs/heads/srj-static-const
- refs/heads/srj-strided-store
- refs/heads/srj-tidyh
- refs/heads/srj-tiff
- refs/heads/srj-trace
- refs/heads/srj-tutorial
- refs/heads/srj-using
- refs/heads/srj-wasmfix
- refs/heads/srj-xor2
- refs/heads/srj/abstract-gen-without-get-output-func-KEEP
- refs/heads/srj/aligned-alloc
- refs/heads/srj/aligned-alloc-2
- refs/heads/srj/aligned-malloc-with-aligned-alloc
- refs/heads/srj/all-explicit-ctor
- refs/heads/srj/anderson-thread-info-ptr
- refs/heads/srj/aot-perf
- refs/heads/srj/apps-hamal
- refs/heads/srj/argv-signatures
- refs/heads/srj/argv-types
- refs/heads/srj/async-test
- refs/heads/srj/b2cpp-const-data
- refs/heads/srj/better-xt-dispatch
- refs/heads/srj/bfloat1
- refs/heads/srj/bp
- refs/heads/srj/build_halide_h
- refs/heads/srj/c-bool
- refs/heads/srj/cache-clear
- refs/heads/srj/clang-fmt-ignore
- refs/heads/srj/clang-tidy
- refs/heads/srj/clear-c-cache
- refs/heads/srj/cmake-asan
- refs/heads/srj/cmake-asan2
- refs/heads/srj/cmake-jit-generators
- refs/heads/srj/configure-cmake
- refs/heads/srj/cpp-generator-v2-experiment-KEEP
- refs/heads/srj/crosscompile
- refs/heads/srj/csv
- refs/heads/srj/ctad
- refs/heads/srj/debug-to-file-api
- refs/heads/srj/depr
- refs/heads/srj/deprecation
- refs/heads/srj/device-copy
- refs/heads/srj/example
- refs/heads/srj/experiment
- refs/heads/srj/experiment-6967
- refs/heads/srj/exporting
- refs/heads/srj/expr_t
- refs/heads/srj/external-tensors
- refs/heads/srj/f16-convert
- refs/heads/srj/fix-pytorch
- refs/heads/srj/fixed-rollback
- refs/heads/srj/fopen-fix
- refs/heads/srj/forward
- refs/heads/srj/forward-name
- refs/heads/srj/gen-func
- refs/heads/srj/gen-func-2
- refs/heads/srj/gen-func-3
- refs/heads/srj/gen2-1
- refs/heads/srj/gen_closure
- refs/heads/srj/generator_aot_gpu_multi_context_threaded
- refs/heads/srj/globals
- refs/heads/srj/halide-buffer-crop
- refs/heads/srj/halide-malloc-alignment
- refs/heads/srj/halide-must-use
- refs/heads/srj/halide-runtime-must-use-result
- refs/heads/srj/hang-repro
- refs/heads/srj/hannk
- refs/heads/srj/hannk-aliasing
- refs/heads/srj/hannk-error-checking
- refs/heads/srj/hannk-errors
- refs/heads/srj/hannk-inplace
- refs/heads/srj/hannk-mmap
- refs/heads/srj/hannk-tflite-27
- refs/heads/srj/hannk-verbosity
- refs/heads/srj/hdrs
- refs/heads/srj/html-becomes-viz
- refs/heads/srj/implicit-mult-widening
- refs/heads/srj/issue-7076
- refs/heads/srj/iwyu
- refs/heads/srj/iwyu-2
- refs/heads/srj/iwyu-6
- refs/heads/srj/libHANNK
- refs/heads/srj/llvm_type_of
- refs/heads/srj/maybe-unused
- refs/heads/srj/meanop
- refs/heads/srj/metadata-calling-convention
- refs/heads/srj/more-tidy
- refs/heads/srj/msan-dtf
- refs/heads/srj/multimeta
- refs/heads/srj/nanobind
- refs/heads/srj/new-rt-1
- refs/heads/srj/no-threadpool
- refs/heads/srj/no-timeout-thread
- refs/heads/srj/oglc-mutexed
- refs/heads/srj/param-map
- refs/heads/srj/pip-15.x
- refs/heads/srj/pip-cron
- refs/heads/srj/possible-uninited
- refs/heads/srj/pr-7566
- refs/heads/srj/printer-size
- refs/heads/srj/profiler-data-race
- refs/heads/srj/ptr-int-cast
- refs/heads/srj/pyapps
- refs/heads/srj/pyext-fix
- refs/heads/srj/pygen-class
- refs/heads/srj/pygen-deux
- refs/heads/srj/pygen-func
- refs/heads/srj/pygen-native-types
- refs/heads/srj/pyinstall
- refs/heads/srj/pypi-try
- refs/heads/srj/pystuff
- refs/heads/srj/python-buffer-unpack
- refs/heads/srj/python-tutorial
- refs/heads/srj/reshape
- refs/heads/srj/rt-error-smallify
- refs/heads/srj/rt-return-types
- refs/heads/srj/runtime-error-handling
- refs/heads/srj/sat-fixes-exp
- refs/heads/srj/sat-fixes-exp-2
- refs/heads/srj/shadow-field
- refs/heads/srj/snprintf
- refs/heads/srj/spirv-license
- refs/heads/srj/stat-buf-deprecations
- refs/heads/srj/static-buffer-generators
- refs/heads/srj/stmt-html
- refs/heads/srj/stringify
- refs/heads/srj/synth-gen-params
- refs/heads/srj/synth-params-python
- refs/heads/srj/test-arm_sve_redux
- refs/heads/srj/test-intrinsics-bounds
- refs/heads/srj/test8076
- refs/heads/srj/test8078
- refs/heads/srj/test8094
- refs/heads/srj/test8105a
- refs/heads/srj/test8115
- refs/heads/srj/test_tmpdir_fix
- refs/heads/srj/tidy
- refs/heads/srj/tidy-format-14
- refs/heads/srj/tidymore
- refs/heads/srj/tidymore2
- refs/heads/srj/tls
- refs/heads/srj/tls-3
- refs/heads/srj/tls-4
- refs/heads/srj/tls-ucon
- refs/heads/srj/tmp-unschedule-experiment
- refs/heads/srj/tot-fix
- refs/heads/srj/try-revert-sat
- refs/heads/srj/type-traits
- refs/heads/srj/typed-func
- refs/heads/srj/ucon-all-const
- refs/heads/srj/ucon-non-const
- refs/heads/srj/visit-warnings
- refs/heads/srj/wasm-atomic2
- refs/heads/srj/wasm-simd
- refs/heads/srj/wasm-stuff
- refs/heads/srj/wasm-threads
- refs/heads/srj/wasm-updates
- refs/heads/srj/wasm-work
- refs/heads/srj/wip
- refs/heads/srj/x-rounding
- refs/heads/srj/xbuf
- refs/heads/srj/xc+plus+size+tmp
- refs/heads/srj/xc-types
- refs/heads/srj/xt-uint-cast-test
- refs/heads/srj/xtensa-arch
- refs/heads/srj/xtensa-merge
- refs/heads/srj/xvc-experimetn
- refs/heads/srj/zlib-embed
- refs/heads/standalone_autoscheduler
- refs/heads/standalone_autoscheduler_arm_worker
- refs/heads/standalone_autoscheduler_arm_worker_amazon
- refs/heads/standalone_autoscheduler_gpu
- refs/heads/standalone_autoscheduler_hexagon
- refs/heads/sticky_task_assignments
- refs/heads/store_with
- refs/heads/store_with_solver_for_super_simplify
- refs/heads/strict_float_cse_fix
- refs/heads/super_simplify
- refs/heads/super_simplify_v2
- refs/heads/super_simplify_v3
- refs/heads/transitive_wrapper
- refs/heads/trigger-release-v16
- refs/heads/tzumao-autodiff-boundarycond
- refs/heads/tzumao-gradient-autoscheduler-bug
- refs/heads/tzumao-predicate-store-load
- refs/heads/tzumao-python-buffer
- refs/heads/tzumao_autodiff_unbounded
- refs/heads/tzumao_improve_gradient_autoscheduler
- refs/heads/tzumao_issue_4297
- refs/heads/tzumao_licm_before_BI
- refs/heads/unbounded_bugs
- refs/heads/undo_async_copy_chain_black_list
- refs/heads/use_string_literals_for_blobs
- refs/heads/users/lukas/python-pip
- refs/heads/validate_sched_error_msg
- refs/heads/var_ir_fix
- refs/heads/vksnk/async-experiment
- refs/heads/vksnk/async-multiple-producers
- refs/heads/vksnk/async-order
- refs/heads/vksnk/better-loop-carry
- refs/heads/vksnk/better-message
- refs/heads/vksnk/bound-storage
- refs/heads/vksnk/bounds-widen-right
- refs/heads/vksnk/c-print-type
- refs/heads/vksnk/c-round
- refs/heads/vksnk/check-return-result
- refs/heads/vksnk/compute-with-bug
- refs/heads/vksnk/compute_with_async
- refs/heads/vksnk/dma-limit-channels
- refs/heads/vksnk/dma-min-max
- refs/heads/vksnk/expr-match-shuffle
- refs/heads/vksnk/extract-from-scalar
- refs/heads/vksnk/f16-load
- refs/heads/vksnk/fix-packvr
- refs/heads/vksnk/fix_halide_xtensa_narrow_with_rounding_shift_i16
- refs/heads/vksnk/fused-compute-with
- refs/heads/vksnk/hoist-storage-bug
- refs/heads/vksnk/lerp-intrinsics
- refs/heads/vksnk/lower-signed-shifts
- refs/heads/vksnk/missing-exception
- refs/heads/vksnk/non-widening-halves
- refs/heads/vksnk/optimize-shuffles
- refs/heads/vksnk/replace-all
- refs/heads/vksnk/restrict
- refs/heads/vksnk/roll-buffer
- refs/heads/vksnk/roundeven-arm
- refs/heads/vksnk/rvar-bounds
- refs/heads/vksnk/simplify-slice
- refs/heads/vksnk/skip-semaphores
- refs/heads/vksnk/storage-folding
- refs/heads/vksnk/strided-load-of-4_2
- refs/heads/vksnk/typed-scope
- refs/heads/vksnk/update-simd-driver
- refs/heads/vksnk/vectorize-bug
- refs/heads/vksnk/vectorize-scalarize
- refs/heads/vksnk/widening_absd
- refs/heads/vksnk/xtensa-codegen-fp16
- refs/heads/vksnk/xtensa-dma-improvements
- refs/heads/vksnk/xtensa-regroup-pass
- refs/heads/vksnk/xtensa/lift-allocs
- refs/heads/vulkan
- refs/heads/vulkan-diagnose-alloc-failures
- refs/heads/vulkan-phase0-adts
- refs/heads/vulkan-phase1-spirv
- refs/heads/vulkan-phase2-runtime
- refs/heads/vulkan2
- refs/heads/vulkan_fix_gpu_dynamic_shared_test
- refs/heads/vulkan_fix_subregion_memory_offsets
- refs/heads/webassembly-old
- refs/heads/winograd
- refs/heads/wording_fix
- refs/heads/xtensa-codegen
- refs/heads/xtensa-codegen-parallel
- refs/heads/xuanda/fix-serialize-bad-partition-always
- refs/remotes/origin/rootjalex/add_autosched_caching
- refs/tags/release_2018_02_15
- refs/tags/release_2019_08_27
- refs/tags/release_8.0.0
- refs/tags/v10.0.0
- refs/tags/v10.0.1
- refs/tags/v11.0.0
- refs/tags/v11.0.1
- refs/tags/v12.0.0
- refs/tags/v12.0.1
- refs/tags/v13.0.0
- refs/tags/v13.0.1
- refs/tags/v13.0.2
- refs/tags/v13.0.3
- refs/tags/v13.0.4
- refs/tags/v14.0.0
- refs/tags/v15.0.0
- refs/tags/v15.0.1
- refs/tags/v16.0.0
- refs/tags/v17.0.0
- refs/tags/v17.0.1
- refs/tags/v8.0.0
Rewrite IREquality to use a more compact stack instead of deep recursion (#8198)
File | Mode | Size |
---|---|---|
.github | ||
apps | ||
cmake | ||
dependencies | ||
doc | ||
packaging | ||
python_bindings | ||
src | ||
test | ||
tools | ||
tutorial | ||
util | ||
.clang-format | -rw-r--r-- | 1.4 KB |
.clang-format-ignore | -rw-r--r-- | 383 bytes |
.clang-tidy | -rw-r--r-- | 7.6 KB |
.gitattributes | -rw-r--r-- | 342 bytes |
.gitignore | -rw-r--r-- | 4.9 KB |
.gitmodules | -rw-r--r-- | 0 bytes |
CMakeLists.txt | -rw-r--r-- | 10.9 KB |
CMakePresets.json | -rw-r--r-- | 6.8 KB |
CODE_OF_CONDUCT.md | -rw-r--r-- | 3.5 KB |
LICENSE.txt | -rw-r--r-- | 14.4 KB |
MANIFEST.in | -rw-r--r-- | 159 bytes |
Makefile | -rw-r--r-- | 105.7 KB |
README.md | -rw-r--r-- | 16.5 KB |
README_cmake.md | -rw-r--r-- | 77.0 KB |
README_fuzz_testing.md | -rw-r--r-- | 3.9 KB |
README_python.md | -rw-r--r-- | 31.8 KB |
README_rungen.md | -rw-r--r-- | 12.1 KB |
README_vulkan.md | -rw-r--r-- | 11.4 KB |
README_webassembly.md | -rw-r--r-- | 10.4 KB |
README_webgpu.md | -rw-r--r-- | 5.2 KB |
pyproject.toml | -rw-r--r-- | 196 bytes |
requirements.txt | -rw-r--r-- | 130 bytes |
run-clang-format.sh | -rwxr-xr-x | 1.4 KB |
run-clang-tidy.sh | -rwxr-xr-x | 3.8 KB |
setup.py | -rw-r--r-- | 1.2 KB |