https://github.com/halide/Halide
- HEAD
- refs/heads/Halide_unsharp
- refs/heads/abadams/align_strided_const_loads
- refs/heads/abadams/alloca
- refs/heads/abadams/atomic_parallel_compiled_in
- refs/heads/abadams/atomic_vector_non_recursive
- refs/heads/abadams/averaging_tree
- refs/heads/abadams/avoid_name_mangling_in_cross_module_dependencies
- refs/heads/abadams/better_absd
- refs/heads/abadams/better_codegen_for_non_const_ramps
- refs/heads/abadams/bgu_cholesky
- refs/heads/abadams/braces_around_statements
- refs/heads/abadams/cache_tighten_producer_consumer_nodes
- refs/heads/abadams/check_reorder_dups
- refs/heads/abadams/clarify_broadcast_shuffle
- refs/heads/abadams/compositing_app
- refs/heads/abadams/cond_wait_spin
- refs/heads/abadams/cse_in_unroll_split_tuples
- refs/heads/abadams/custom_cuda_context
- refs/heads/abadams/custom_cuda_context_2
- refs/heads/abadams/custom_cuda_context_3
- refs/heads/abadams/d3d12abi
- refs/heads/abadams/deflake_mullapudi_reorder
- refs/heads/abadams/delete_prepare_for_early_exit
- refs/heads/abadams/depthwise_separable_conv
- refs/heads/abadams/diagnose_boundary_condition_failure
- refs/heads/abadams/disable_onnx_app_on_mac
- refs/heads/abadams/divide_using_pavgw
- refs/heads/abadams/dont_link_to_cudart
- refs/heads/abadams/dont_reinterpret_concat
- refs/heads/abadams/early_out
- refs/heads/abadams/enable_f16c
- refs/heads/abadams/extract_concat_bits
- refs/heads/abadams/fast_integer_divide_round_to_zero
- refs/heads/abadams/faster_runtime_integer_division
- refs/heads/abadams/faster_unroll
- refs/heads/abadams/fix-arm-seg2
- refs/heads/abadams/fix_4211
- refs/heads/abadams/fix_5323
- refs/heads/abadams/fix_5329
- refs/heads/abadams/fix_5889
- refs/heads/abadams/fix_6984
- refs/heads/abadams/fix_7229
- refs/heads/abadams/fix_7260
- refs/heads/abadams/fix_7365
- refs/heads/abadams/fix_7374
- refs/heads/abadams/fix_7504
- refs/heads/abadams/fix_7514
- refs/heads/abadams/fix_7531
- refs/heads/abadams/fix_7584
- refs/heads/abadams/fix_7584_v2
- refs/heads/abadams/fix_7742
- refs/heads/abadams/fix_7756
- refs/heads/abadams/fix_7761
- refs/heads/abadams/fix_7768
- refs/heads/abadams/fix_7786
- refs/heads/abadams/fix_7810
- refs/heads/abadams/fix_7811
- refs/heads/abadams/fix_7815
- refs/heads/abadams/fix_7867
- refs/heads/abadams/fix_7871
- refs/heads/abadams/fix_7872
- refs/heads/abadams/fix_7873
- refs/heads/abadams/fix_7888
- refs/heads/abadams/fix_7890
- refs/heads/abadams/fix_7891
- refs/heads/abadams/fix_7892
- refs/heads/abadams/fix_7893
- refs/heads/abadams/fix_7906
- refs/heads/abadams/fix_7909
- refs/heads/abadams/fix_7968
- refs/heads/abadams/fix_8038
- refs/heads/abadams/fix_8054
- refs/heads/abadams/fix_arm_fcvtmp
- refs/heads/abadams/fix_autoschedule_feature_transposition
- refs/heads/abadams/fix_cse_name_collisions
- refs/heads/abadams/fix_cuda_mat_mul_assert
- refs/heads/abadams/fix_deinterleave_bug
- refs/heads/abadams/fix_deinterleave_for_reinterpret
- refs/heads/abadams/fix_div_round_to_zero
- refs/heads/abadams/fix_fft_compile_time_regression
- refs/heads/abadams/fix_generate_output_snippets
- refs/heads/abadams/fix_if_nesting_condition
- refs/heads/abadams/fix_leaks_in_memoize_test
- refs/heads/abadams/fix_lgtm_warnings
- refs/heads/abadams/fix_links_to_master
- refs/heads/abadams/fix_load_of_broadcast
- refs/heads/abadams/fix_lossless_cast_of_sub
- refs/heads/abadams/fix_onnx_app
- refs/heads/abadams/fix_pointless_lower_condition
- refs/heads/abadams/fix_potential_gpu_deadlock
- refs/heads/abadams/fix_realize_condition_depends_on_tuple
- refs/heads/abadams/fix_reduce_expr_modulo_of_vector
- refs/heads/abadams/fix_riscv_vx_vi
- refs/heads/abadams/fix_round
- refs/heads/abadams/fix_stencil_chain_gpu_schedule
- refs/heads/abadams/fix_track_bounds_intervals
- refs/heads/abadams/fix_tutorial_2
- refs/heads/abadams/forward_partition_methods
- refs/heads/abadams/fully_fused_depthwise_separable_conv
- refs/heads/abadams/fuzz_sliding_window
- refs/heads/abadams/gaussian_blur_app
- refs/heads/abadams/generator_infinite_default_timeout
- refs/heads/abadams/gpu_autoscheduler_parallel_random_probes
- refs/heads/abadams/include_riscv_in_readme
- refs/heads/abadams/interleave_nested_vector
- refs/heads/abadams/ir_match_by_ref
- refs/heads/abadams/lerp_plus_cast
- refs/heads/abadams/local_laplacian_code_size
- refs/heads/abadams/lower_halving_sub
- refs/heads/abadams/lower_rounding_shift_right
- refs/heads/abadams/mac-arm-fixes
- refs/heads/abadams/make_fast_inverse_test_throughput_limited
- refs/heads/abadams/makefile_serialization_support
- refs/heads/abadams/mismatched_new_delete
- refs/heads/abadams/mixed_sign_mul_shift_right
- refs/heads/abadams/mixed_width_mul_shift_right
- refs/heads/abadams/multiple_scatter
- refs/heads/abadams/mux_intrinsic
- refs/heads/abadams/name_helpers
- refs/heads/abadams/narrow_predicates
- refs/heads/abadams/nested_vectorization_compile_time_regression_fix
- refs/heads/abadams/nested_vectorization_tweaks
- refs/heads/abadams/parallel_simd_op_check
- refs/heads/abadams/per_instance_profiling
- refs/heads/abadams/precompute_shared_mem_size
- refs/heads/abadams/prefer_no_gather
- refs/heads/abadams/print_uncaught_exception
- refs/heads/abadams/promote_fixed_point_intrinsics
- refs/heads/abadams/psabdw
- refs/heads/abadams/random_pipelines
- refs/heads/abadams/rationalize_gpu_for_loop_names
- refs/heads/abadams/reenable_unscheduled_stage_warning
- refs/heads/abadams/reinterpret_vector
- refs/heads/abadams/remove_arch_os_for_shaders
- refs/heads/abadams/remove_bad_pruning
- refs/heads/abadams/remove_parameter_self_references
- refs/heads/abadams/remove_readnone_on_functions
- refs/heads/abadams/remove_use_of_python_config_in_onnx_makefile
- refs/heads/abadams/reschedule_bgu
- refs/heads/abadams/reschedule_bilateral_grid
- refs/heads/abadams/rewrite_atomic_pass
- refs/heads/abadams/rounding_shift_right_use_average
- refs/heads/abadams/rungenmain_error
- refs/heads/abadams/sampling_profiler_overhead_v2
- refs/heads/abadams/scope_improvements
- refs/heads/abadams/simpler_broadcasts
- refs/heads/abadams/simplify_correlated_pyramid
- refs/heads/abadams/siotas_20
- refs/heads/abadams/sioutas_20
- refs/heads/abadams/slide_over_split_loop
- refs/heads/abadams/sorting_network_working_branch
- refs/heads/abadams/stable_topological_order
- refs/heads/abadams/string_view
- refs/heads/abadams/strip_asserts_last
- refs/heads/abadams/switch_stmt
- refs/heads/abadams/target_specific_lerp
- refs/heads/abadams/time_lowering_passes
- refs/heads/abadams/track_failedness_through_solver_lets
- refs/heads/abadams/turn_off_slp_vectorization_for_avx512
- refs/heads/abadams/tweak_unpack_buffers
- refs/heads/abadams/undo_pointless_widening
- refs/heads/abadams/unordered_blocks
- refs/heads/abadams/unsigned_demosaic
- refs/heads/abadams/update_makefile_for_llvm_19
- refs/heads/abadams/use_arm_for_runtime_triple
- refs/heads/abadams/use_pmaddubsw_for_downsample
- refs/heads/abadams/validate_gpu_schedules
- refs/heads/abadams/vector_reduce_hexagon_predicate
- refs/heads/abadams/vector_scan
- refs/heads/abadams/vst_type_fix
- refs/heads/abadams/widening_let_bug
- refs/heads/abadams/x86_avg
- refs/heads/abadams/zen4
- refs/heads/adadams/profile_allocator
- refs/heads/add_image_checks_after_bounds_inference_plus_new_rules
- refs/heads/add_outermost_to_extern
- refs/heads/add_vectorization_to_search_space
- refs/heads/aelphy/feature_cadence_changes
- refs/heads/aelphy/float_extracts
- refs/heads/align_loads_comment_fix
- refs/heads/alina-strided-store
- refs/heads/another_buffer_copy_fix
- refs/heads/arm_sve_redux
- refs/heads/ataei-block_asserts-codegen
- refs/heads/ataei-debug_info
- refs/heads/ataei-fix-pow
- refs/heads/ataei-gen_str_param
- refs/heads/ataei-implicit_lhs_vars
- refs/heads/ataei-onnx
- refs/heads/ataei-onnx_converter_update
- refs/heads/ataei-onnx_pybind
- refs/heads/ataei-resnet50_benchmarks
- refs/heads/ataei-standalone_autoscheduler
- refs/heads/ataei_lots_of_inputs
- refs/heads/auto_sched_benchmarks
- refs/heads/auto_sched_estimates
- refs/heads/auto_sched_inline
- refs/heads/auto_sched_test_notparallel
- refs/heads/autoschedule_top_down
- refs/heads/autoschedule_with_convnet
- refs/heads/autoscheduler_scalar_imageparam_fix
- refs/heads/backports/10.x
- refs/heads/backports/11.x
- refs/heads/backports/12.x
- refs/heads/backports/13.x
- refs/heads/balance_expressions
- refs/heads/bazel
- refs/heads/benchmarks
- refs/heads/blaze
- refs/heads/bounds_buffer_lets_fix
- refs/heads/bounds_correct_vs_bounds_loaded_reduced
- refs/heads/buffer_device_api_target
- refs/heads/bug_device_free
- refs/heads/bug_inline_unbounded
- refs/heads/build/fix-xcode-2
- refs/heads/build/manylinux-fixes
- refs/heads/circ_buffer
- refs/heads/cmake-no-runtime-debug-symbols
- refs/heads/cmake/asan
- refs/heads/cmake/deps-cleanup
- refs/heads/cmake/find-modules
- refs/heads/cmake/spirv
- refs/heads/cmake_wasm_features
- refs/heads/compute_at_guard_with_if_goes_on_stack
- refs/heads/compute_with_at
- refs/heads/compute_with_check
- refs/heads/compute_with_excessive_bounds
- refs/heads/compute_with_inlined
- refs/heads/compute_with_remove_is_right_level
- refs/heads/cpack/nuget
- refs/heads/ctest/wrappers
- refs/heads/cuda-constant
- refs/heads/d3d12-allocation-cache
- refs/heads/deferred_cse_after_inlining
- refs/heads/destructor_calls_deinit
- refs/heads/dg/deserialize_unmapped_objects
- refs/heads/dg/fix_vulkan_codegen_bool_conversion
- refs/heads/dg/vulkan_conform_api
- refs/heads/dg/vulkan_region_allocator_fixes
- refs/heads/dgerstmann/fix-vulkan-memory-config-init
- refs/heads/disable_acquire_release_test_vulkan
- refs/heads/distinct_wrapper_names
- refs/heads/dkg/6863_asan_fixes
- refs/heads/dkg/vulkan
- refs/heads/dpalermo_dmabuf
- refs/heads/dpalermo_dmabuf_libion
- refs/heads/dpalermo_hexagon_remote_202003
- refs/heads/dpalermo_sdk4_2_0_2
- refs/heads/ds/buffer-get-pure
- refs/heads/ds/opt-tile-size
- refs/heads/ds/tail-none
- refs/heads/ds/while
- refs/heads/dsharletg/bitwise-intrinsics
- refs/heads/dsharletg/find-vector-reduce
- refs/heads/dsharletg/jit-optimization
- refs/heads/dsharletg/memcpy-copy_from
- refs/heads/dsharletg/pattern-headroom
- refs/heads/dsharletg/refactor-host-alignment
- refs/heads/dsharletg/runtime-size
- refs/heads/dsharletg/simplify-abs
- refs/heads/dsharletg/simplify-type-bounds
- refs/heads/dsharletg/specialize-bounds
- refs/heads/dsharletg/upsample-channels
- refs/heads/empty_prefetch
- refs/heads/emscripten_vector_fix
- refs/heads/export_all-wsmoses
- refs/heads/expr_auto_sched
- refs/heads/extern_bugs
- refs/heads/extern_host_alloc
- refs/heads/factor_parallel_codegen_hack
- refs/heads/fast_sync_tsan
- refs/heads/faster_integer_division
- refs/heads/feature/apps-external
- refs/heads/feature/cmake-presets
- refs/heads/feature/convert
- refs/heads/feature/f16_interleave
- refs/heads/feature/gather_load_q7
- refs/heads/feature/llvm-codemodel
- refs/heads/feature/load_predicated
- refs/heads/feature/luma_regression
- refs/heads/feature/maintanence
- refs/heads/feature/reinterprets
- refs/heads/feature/tcm_bump_allocator
- refs/heads/feature/xtensa_fix_interleave_q8
- refs/heads/feature/xtensa_q8_tests
- refs/heads/find_intrinsics_issue
- refs/heads/find_intrinsics_widening_lets
- refs/heads/fix-floated-pure-stage
- refs/heads/fix-race-condition
- refs/heads/fix_hexagon_alignment
- refs/heads/fix_hvx_intrinsics
- refs/heads/fix_prefetch_test
- refs/heads/fix_windows_vs15_build
- refs/heads/fixed_length_vectors
- refs/heads/fixed_point_local_laplac
- refs/heads/gemmlowp
- refs/heads/generate
- refs/heads/gha/pip
- refs/heads/gpu_canon_fix
- refs/heads/halide_ir_flatbuffer
- refs/heads/hex_dma2_async
- refs/heads/hexagon_le_runtime
- refs/heads/hexagon_priority
- refs/heads/hexagon_setpriority
- refs/heads/hexagon_strided_pred_load
- refs/heads/hexagon_sysmon_markers
- refs/heads/imaging-synthesis
- refs/heads/includes_fix
- refs/heads/ios_fast_sync_fix
- refs/heads/jia-kai-fix-runtime-cuda-init
- refs/heads/kamil-openglcompute-infinity
- refs/heads/kamil/name_pthread_workers
- refs/heads/kp_bit_shift
- refs/heads/line_buffer
- refs/heads/loop_carry_not_working
- refs/heads/lower_on_huge_stack
- refs/heads/main
- refs/heads/master
- refs/heads/memoize_with_extents
- refs/heads/metal_float16
- refs/heads/metaprogrammed_simplifier_mod
- refs/heads/mohamedadaly-vmlal
- refs/heads/more_powerful_sliding
- refs/heads/new_autoschedule_with_new_simplifier_arm_worker_branch
- refs/heads/new_autoscheduler
- refs/heads/new_simplifier_rule_testing
- refs/heads/newer_ion_ioctl
- refs/heads/no_bounds_query_when_bounds_used
- refs/heads/opengl_compute_buffer_types_fix
- refs/heads/openglcompute_reuse_shared_allocations
- refs/heads/optmize_reorder
- refs/heads/par_for_opt
- refs/heads/pdb/fix_7806
- refs/heads/pdb/hexagon_remote_cmake
- refs/heads/pdb_add_libcpp_makefile_inc
- refs/heads/pdb_eliminate_interleaves_test
- refs/heads/pdb_fix_clang_build
- refs/heads/pdb_fix_install_qc
- refs/heads/pdb_fix_loop_carry
- refs/heads/pdb_fix_simd_op_check_hvx
- refs/heads/pdb_mul_div_mod_multi_thread
- refs/heads/pdb_remove_hvx_v64
- refs/heads/perform_inline_with_order
- refs/heads/pr/2572
- refs/heads/pr/2676
- refs/heads/pr/2975
- refs/heads/pr/3017
- refs/heads/pr/3081
- refs/heads/pr/3387
- refs/heads/pr/3939
- refs/heads/pr/3960
- refs/heads/pr/4380
- refs/heads/pr/4414
- refs/heads/pr/5331
- refs/heads/pr/5438
- refs/heads/pr/5455
- refs/heads/pr/5758_2
- refs/heads/predicated_vector
- refs/heads/prefetch_specialize
- refs/heads/print_schedule
- refs/heads/profile_hardware_counters
- refs/heads/random-pipelines
- refs/heads/rdom_with_pure_vars
- refs/heads/readme-fix-gcd
- refs/heads/realization_order
- refs/heads/refactor_module
- refs/heads/register_promotion
- refs/heads/release/10.x
- refs/heads/release/11.x
- refs/heads/release/12.x
- refs/heads/release/13.x
- refs/heads/release/14.x
- refs/heads/release/15.x
- refs/heads/release/16.x
- refs/heads/release/17.x
- refs/heads/release/8.x
- refs/heads/remove_max_on_fuse_factor
- refs/heads/reorder_rvar
- refs/heads/reset_unique_counter
- refs/heads/revert-3612-ataei-speedup_compiletime
- refs/heads/revert-7009-rootjalex/distribute-w_shl
- refs/heads/revert-7601-compile_hexagon_remote
- refs/heads/riscv_update
- refs/heads/rl_simplifier_rules
- refs/heads/rootjalex/add_simpl_rules
- refs/heads/rootjalex/arm-optimize
- refs/heads/rootjalex/autoscheduler_mcts
- refs/heads/rootjalex/bounds-rewriter
- refs/heads/rootjalex/bounds_synthesis
- refs/heads/rootjalex/cbounds
- refs/heads/rootjalex/cbounds_predicated
- refs/heads/rootjalex/fix-sat-overflow
- refs/heads/rootjalex/fix_estimate_issue
- refs/heads/rootjalex/fix_failed_unrolls
- refs/heads/rootjalex/gsoc_codegen
- refs/heads/rootjalex/improve_cbounds_fixed
- refs/heads/rootjalex/improve_constant_bounds
- refs/heads/rootjalex/pitchfork-arm
- refs/heads/rootjalex/reinterpret-simplify
- refs/heads/rootjalex/rts
- refs/heads/rootjalex/super_simplify_bounds
- refs/heads/rootjalex/test_cbounds_fixed
- refs/heads/rootjalex/test_constant_bounds
- refs/heads/rootjalex/trs-codegen
- refs/heads/rootjalex/trs-codegen-cross
- refs/heads/rootjalex/trs-merge
- refs/heads/rootjalex/uint32-int32-cast
- refs/heads/rootjalex/x86-hadds
- refs/heads/rootjalex/x86-optimize
- refs/heads/rootjalex/x86-optimize-test
- refs/heads/rootjalex/x86-sat
- refs/heads/rootjalex/x86-test
- refs/heads/rule_removal_experiments
- refs/heads/schedule-output-storage
- refs/heads/separate_bounds_query_entrypoint
- refs/heads/shallow
- refs/heads/shift_amount_type_change
- refs/heads/shoaibkamil/cmake-without-arm
- refs/heads/shoaibkamil/correct_memory_fences
- refs/heads/shoaibkamil/d3d-fixes
- refs/heads/shoaibkamil/deprecate_openglcompute
- refs/heads/shoaibkamil/json
- refs/heads/shoaibkamil/llvm_clone_tag
- refs/heads/shoaibkamil/minor-vcpkg-doc-change
- refs/heads/shoaibkamil/opengl_compute_tests
- refs/heads/shoaibkamil/performance_tests_as_generators
- refs/heads/shoaibkamil/rule_removal_experiments
- refs/heads/shoaibkamil/super_simplify_with_interpreter
- refs/heads/shoaibkamil/windows-arm-fix-attributes
- refs/heads/sim_shlib_addr_print
- refs/heads/simplify-nested-broadcasts
- refs/heads/simplify-vectorreduce-shuffles2
- refs/heads/simplify_mod
- refs/heads/sioutas_2020
- refs/heads/sioutas_2020_autoscheduler
- refs/heads/slomp/gpu-codegen-profiling
- refs/heads/slomp/msvc-static-analysis
- refs/heads/solve_div
- refs/heads/solve_div_master
- refs/heads/solve_div_simplifier_test
- refs/heads/sr/python-late-binding-defaults
- refs/heads/srj-aaa
- refs/heads/srj-alloc
- refs/heads/srj-alloca
- refs/heads/srj-appmake2
- refs/heads/srj-armv83a
- refs/heads/srj-aslog
- refs/heads/srj-assert
- refs/heads/srj-assoc
- refs/heads/srj-auto-multi
- refs/heads/srj-auto-multi2
- refs/heads/srj-auto_schedule_mat_mul
- refs/heads/srj-autosched
- refs/heads/srj-b2cpphide
- refs/heads/srj-barr
- refs/heads/srj-bits
- refs/heads/srj-blacklist
- refs/heads/srj-bounds
- refs/heads/srj-bufcalltype
- refs/heads/srj-bufcallwrap
- refs/heads/srj-bufcallwrap2
- refs/heads/srj-buffer
- refs/heads/srj-bv
- refs/heads/srj-classic-autotune
- refs/heads/srj-clean
- refs/heads/srj-constcall
- refs/heads/srj-crosscompile
- refs/heads/srj-ctlz
- refs/heads/srj-cvec-patch
- refs/heads/srj-dag
- refs/heads/srj-debug-to-file
- refs/heads/srj-deir
- refs/heads/srj-f16
- refs/heads/srj-fp16
- refs/heads/srj-fsch
- refs/heads/srj-fthru
- refs/heads/srj-g2
- refs/heads/srj-g3
- refs/heads/srj-gha-test-fixes
- refs/heads/srj-hidden
- refs/heads/srj-hide2
- refs/heads/srj-hvx
- refs/heads/srj-hvx-bug
- refs/heads/srj-hvx-codegen-bug
- refs/heads/srj-hvx-nocopy
- refs/heads/srj-hvxshift
- refs/heads/srj-iib
- refs/heads/srj-initshape
- refs/heads/srj-inv
- refs/heads/srj-ir
- refs/heads/srj-irmut2
- refs/heads/srj-iwyu
- refs/heads/srj-iwyu3
- refs/heads/srj-javascript_work_in_progress
- refs/heads/srj-lensblur
- refs/heads/srj-lessinc
- refs/heads/srj-llvm-loop-opt
- refs/heads/srj-mak
- refs/heads/srj-maxthreads
- refs/heads/srj-mod
- refs/heads/srj-msan
- refs/heads/srj-msan-call
- refs/heads/srj-muldivmod
- refs/heads/srj-mut
- refs/heads/srj-outputs-2
- refs/heads/srj-parse
- refs/heads/srj-pch
- refs/heads/srj-printfunc
- refs/heads/srj-pygp
- refs/heads/srj-revertbits
- refs/heads/srj-schedule-storage
- refs/heads/srj-shl-shr-2
- refs/heads/srj-sio
- refs/heads/srj-static-const
- refs/heads/srj-strided-store
- refs/heads/srj-tidyh
- refs/heads/srj-tiff
- refs/heads/srj-trace
- refs/heads/srj-tutorial
- refs/heads/srj-using
- refs/heads/srj-wasmfix
- refs/heads/srj-xor2
- refs/heads/srj/abstract-gen-without-get-output-func-KEEP
- refs/heads/srj/aligned-alloc
- refs/heads/srj/aligned-alloc-2
- refs/heads/srj/aligned-malloc-with-aligned-alloc
- refs/heads/srj/all-explicit-ctor
- refs/heads/srj/anderson-thread-info-ptr
- refs/heads/srj/aot-perf
- refs/heads/srj/argv-signatures
- refs/heads/srj/argv-types
- refs/heads/srj/async-test
- refs/heads/srj/b2cpp-const-data
- refs/heads/srj/better-xt-dispatch
- refs/heads/srj/bfloat1
- refs/heads/srj/bp
- refs/heads/srj/build_halide_h
- refs/heads/srj/c-bool
- refs/heads/srj/cache-clear
- refs/heads/srj/clang-fmt-ignore
- refs/heads/srj/clang-tidy
- refs/heads/srj/clear-c-cache
- refs/heads/srj/cmake-asan
- refs/heads/srj/cmake-asan2
- refs/heads/srj/cmake-jit-generators
- refs/heads/srj/configure-cmake
- refs/heads/srj/cpp-generator-v2-experiment-KEEP
- refs/heads/srj/crosscompile
- refs/heads/srj/ctad
- refs/heads/srj/depr
- refs/heads/srj/deprecation
- refs/heads/srj/device-copy
- refs/heads/srj/example
- refs/heads/srj/experiment
- refs/heads/srj/experiment-6967
- refs/heads/srj/exporting
- refs/heads/srj/expr_t
- refs/heads/srj/external-tensors
- refs/heads/srj/fix-pytorch
- refs/heads/srj/fixed-rollback
- refs/heads/srj/fopen-fix
- refs/heads/srj/forward
- refs/heads/srj/forward-name
- refs/heads/srj/gen-func
- refs/heads/srj/gen-func-2
- refs/heads/srj/gen-func-3
- refs/heads/srj/gen2-1
- refs/heads/srj/gen_closure
- refs/heads/srj/generator_aot_gpu_multi_context_threaded
- refs/heads/srj/globals
- refs/heads/srj/halide-buffer-crop
- refs/heads/srj/halide-malloc-alignment
- refs/heads/srj/halide-must-use
- refs/heads/srj/halide-runtime-must-use-result
- refs/heads/srj/hang-repro
- refs/heads/srj/hannk
- refs/heads/srj/hannk-aliasing
- refs/heads/srj/hannk-error-checking
- refs/heads/srj/hannk-errors
- refs/heads/srj/hannk-inplace
- refs/heads/srj/hannk-mmap
- refs/heads/srj/hannk-tflite-27
- refs/heads/srj/hannk-verbosity
- refs/heads/srj/hdrs
- refs/heads/srj/html-becomes-viz
- refs/heads/srj/implicit-mult-widening
- refs/heads/srj/issue-7076
- refs/heads/srj/iwyu
- refs/heads/srj/iwyu-2
- refs/heads/srj/iwyu-6
- refs/heads/srj/libHANNK
- refs/heads/srj/llvm_type_of
- refs/heads/srj/maybe-unused
- refs/heads/srj/meanop
- refs/heads/srj/metadata-calling-convention
- refs/heads/srj/more-tidy
- refs/heads/srj/msan-dtf
- refs/heads/srj/multimeta
- refs/heads/srj/nanobind
- refs/heads/srj/new-rt-1
- refs/heads/srj/no-threadpool
- refs/heads/srj/no-timeout-thread
- refs/heads/srj/oglc-mutexed
- refs/heads/srj/param-map
- refs/heads/srj/pip-15.x
- refs/heads/srj/pip-cron
- refs/heads/srj/possible-uninited
- refs/heads/srj/pr-7566
- refs/heads/srj/printer-size
- refs/heads/srj/profiler-data-race
- refs/heads/srj/ptr-int-cast
- refs/heads/srj/pyapps
- refs/heads/srj/pyext-fix
- refs/heads/srj/pygen-class
- refs/heads/srj/pygen-deux
- refs/heads/srj/pygen-func
- refs/heads/srj/pygen-native-types
- refs/heads/srj/pyinstall
- refs/heads/srj/pypi-try
- refs/heads/srj/pystuff
- refs/heads/srj/python-buffer-unpack
- refs/heads/srj/python-tutorial
- refs/heads/srj/reshape
- refs/heads/srj/rt-error-smallify
- refs/heads/srj/rt-return-types
- refs/heads/srj/runtime-error-handling
- refs/heads/srj/sat-fixes-exp
- refs/heads/srj/sat-fixes-exp-2
- refs/heads/srj/shadow-field
- refs/heads/srj/snprintf
- refs/heads/srj/spirv-license
- refs/heads/srj/stat-buf-deprecations
- refs/heads/srj/static-buffer-generators
- refs/heads/srj/stmt-html
- refs/heads/srj/stringify
- refs/heads/srj/synth-gen-params
- refs/heads/srj/synth-params-python
- refs/heads/srj/test-arm_sve_redux
- refs/heads/srj/test-intrinsics-bounds
- refs/heads/srj/test8076
- refs/heads/srj/test8078
- refs/heads/srj/test8094
- refs/heads/srj/test8105a
- refs/heads/srj/test8115
- refs/heads/srj/test_tmpdir_fix
- refs/heads/srj/tidy
- refs/heads/srj/tidy-format-14
- refs/heads/srj/tidymore
- refs/heads/srj/tidymore2
- refs/heads/srj/tls
- refs/heads/srj/tls-3
- refs/heads/srj/tls-4
- refs/heads/srj/tls-ucon
- refs/heads/srj/tmp-unschedule-experiment
- refs/heads/srj/tot-fix
- refs/heads/srj/try-revert-sat
- refs/heads/srj/type-traits
- refs/heads/srj/typed-func
- refs/heads/srj/ucon-all-const
- refs/heads/srj/ucon-non-const
- refs/heads/srj/visit-warnings
- refs/heads/srj/wasm-atomic2
- refs/heads/srj/wasm-simd
- refs/heads/srj/wasm-stuff
- refs/heads/srj/wasm-threads
- refs/heads/srj/wasm-updates
- refs/heads/srj/wasm-work
- refs/heads/srj/wip
- refs/heads/srj/x-rounding
- refs/heads/srj/xbuf
- refs/heads/srj/xc+plus+size+tmp
- refs/heads/srj/xc-types
- refs/heads/srj/xt-uint-cast-test
- refs/heads/srj/xtensa-arch
- refs/heads/srj/xtensa-merge
- refs/heads/srj/xvc-experimetn
- refs/heads/srj/zlib-embed
- refs/heads/standalone_autoscheduler
- refs/heads/standalone_autoscheduler_arm_worker
- refs/heads/standalone_autoscheduler_arm_worker_amazon
- refs/heads/standalone_autoscheduler_gpu
- refs/heads/standalone_autoscheduler_hexagon
- refs/heads/sticky_task_assignments
- refs/heads/store_with
- refs/heads/store_with_solver_for_super_simplify
- refs/heads/strict_float_cse_fix
- refs/heads/super_simplify
- refs/heads/super_simplify_v2
- refs/heads/super_simplify_v3
- refs/heads/transitive_wrapper
- refs/heads/trigger-release-v16
- refs/heads/tzumao-autodiff-boundarycond
- refs/heads/tzumao-gradient-autoscheduler-bug
- refs/heads/tzumao-predicate-store-load
- refs/heads/tzumao-python-buffer
- refs/heads/tzumao_autodiff_unbounded
- refs/heads/tzumao_improve_gradient_autoscheduler
- refs/heads/tzumao_issue_4297
- refs/heads/tzumao_licm_before_BI
- refs/heads/unbounded_bugs
- refs/heads/undo_async_copy_chain_black_list
- refs/heads/use_string_literals_for_blobs
- refs/heads/users/lukas/python-pip
- refs/heads/validate_sched_error_msg
- refs/heads/var_ir_fix
- refs/heads/vksnk/async-experiment
- refs/heads/vksnk/async-multiple-producers
- refs/heads/vksnk/async-order
- refs/heads/vksnk/better-loop-carry
- refs/heads/vksnk/better-message
- refs/heads/vksnk/bound-storage
- refs/heads/vksnk/bounds-widen-right
- refs/heads/vksnk/c-print-type
- refs/heads/vksnk/c-round
- refs/heads/vksnk/check-return-result
- refs/heads/vksnk/compute-with-bug
- refs/heads/vksnk/compute_with_async
- refs/heads/vksnk/dma-limit-channels
- refs/heads/vksnk/dma-min-max
- refs/heads/vksnk/expr-match-shuffle
- refs/heads/vksnk/extract-from-scalar
- refs/heads/vksnk/f16-load
- refs/heads/vksnk/fix-packvr
- refs/heads/vksnk/fix_halide_xtensa_narrow_with_rounding_shift_i16
- refs/heads/vksnk/fused-compute-with
- refs/heads/vksnk/hoist-storage-bug
- refs/heads/vksnk/lerp-intrinsics
- refs/heads/vksnk/lower-signed-shifts
- refs/heads/vksnk/missing-exception
- refs/heads/vksnk/non-widening-halves
- refs/heads/vksnk/optimize-shuffles
- refs/heads/vksnk/replace-all
- refs/heads/vksnk/restrict
- refs/heads/vksnk/roll-buffer
- refs/heads/vksnk/roundeven-arm
- refs/heads/vksnk/rvar-bounds
- refs/heads/vksnk/simplify-slice
- refs/heads/vksnk/skip-semaphores
- refs/heads/vksnk/storage-folding
- refs/heads/vksnk/strided-load-of-4_2
- refs/heads/vksnk/typed-scope
- refs/heads/vksnk/update-simd-driver
- refs/heads/vksnk/vectorize-bug
- refs/heads/vksnk/vectorize-scalarize
- refs/heads/vksnk/widening_absd
- refs/heads/vksnk/xtensa-codegen-fp16
- refs/heads/vksnk/xtensa-dma-improvements
- refs/heads/vksnk/xtensa-regroup-pass
- refs/heads/vksnk/xtensa/lift-allocs
- refs/heads/vulkan
- refs/heads/vulkan-diagnose-alloc-failures
- refs/heads/vulkan-phase0-adts
- refs/heads/vulkan-phase1-spirv
- refs/heads/vulkan-phase2-runtime
- refs/heads/vulkan2
- refs/heads/vulkan_fix_gpu_dynamic_shared_test
- refs/heads/vulkan_fix_subregion_memory_offsets
- refs/heads/webassembly-old
- refs/heads/winograd
- refs/heads/wording_fix
- refs/heads/xtensa-codegen
- refs/heads/xtensa-codegen-parallel
- refs/heads/xuanda/fix-serialize-bad-partition-always
- refs/remotes/origin/rootjalex/add_autosched_caching
- refs/tags/release_2018_02_15
- refs/tags/release_2019_08_27
- refs/tags/release_8.0.0
- refs/tags/v10.0.0
- refs/tags/v10.0.1
- refs/tags/v11.0.0
- refs/tags/v11.0.1
- refs/tags/v12.0.0
- refs/tags/v12.0.1
- refs/tags/v13.0.0
- refs/tags/v13.0.1
- refs/tags/v13.0.2
- refs/tags/v13.0.3
- refs/tags/v13.0.4
- refs/tags/v14.0.0
- refs/tags/v15.0.0
- refs/tags/v15.0.1
- refs/tags/v16.0.0
- refs/tags/v17.0.0
- refs/tags/v17.0.1
- refs/tags/v8.0.0
Revision | Author | Date | Message | Commit Date |
---|---|---|---|---|
6ae921b | Steven Johnson | 29 January 2021, 17:48:13 UTC | Avoid bogus out-of-memory error for multiple_scatter under wasm | 29 January 2021, 17:48:13 UTC |
6118a62 | Steven Johnson | 29 January 2021, 17:27:30 UTC | Disable a few more wasm-simd ops in simd_op_check (#5679) Recent changes to the final wasm-simd spec means that some instructions aren't being generated (and may not even exist in the same form). Commented out for now; we need to revisit this once the LLVM backend for wasm gets closer to up-to-date with the final spec. | 29 January 2021, 17:27:30 UTC |
288526c | Dillon Sharlet | 28 January 2021, 17:52:16 UTC | Encapsulate a few more symbols (#5672) * Encapsulate more symbols. | 28 January 2021, 17:52:16 UTC |
f427ad1 | Steven Johnson | 28 January 2021, 17:51:11 UTC | Remove deprecated variants of infer_input_bounds() in the Python bindings (#5673) * Remove deprecated variants of infer_input_bounds() in the Python bindings The C++ versions were removed for Halide 12 already, but I missed the Python wrappers. * trigger buildbots | 28 January 2021, 17:51:11 UTC |
813eadc | Alex Reinking | 28 January 2021, 09:41:44 UTC | Fix target detection for i686 (#5675) | 28 January 2021, 09:41:44 UTC |
a8299b5 | Steven Johnson | 27 January 2021, 21:22:40 UTC | Allow LLVM-13 and Clang-13 (#5674) | 27 January 2021, 21:22:40 UTC |
6e3fb56 | Steven Johnson | 26 January 2021, 04:09:53 UTC | FIx intermittent OSX Python crash (#5667) * FIx intermittent OSX Python crash The OSX buildbot has been crashing intermittently on some python tests; debugging showed that in some situations, Introspection's calls to `backtrace()` include bogus addresses (eg 0x08), which cause segfaults when you try to inspect memory near them. The reasons for this aren't entirely clear -- for instance, it only seems to repeat reliably when using the Makefile rather than CMake, and only when doing an 'out-of-tree' build. Rather than try to run this to ground further, this PR just checks for address fields that seem obviously unreasonable (first 256 bytes of address space) and ignore them. * Add -fno-omit-frame-pointer, update sanity check * Update Introspection.cpp | 26 January 2021, 04:09:53 UTC |
d4c27ca | Dillon Sharlet | 25 January 2021, 21:51:57 UTC | Lower saturating arithmetic without widening (#5662) * Lower saturating arithmetic without widening, and handle it in lower_intrinsic. * clang-format, fix saturating sub * cout -> cerr * trigger buildbots Co-authored-by: Alex Reinking <alex.reinking@gmail.com> Co-authored-by: Steven Johnson <srj@google.com> | 25 January 2021, 21:51:57 UTC |
38be3e3 | aankit-ca | 23 January 2021, 01:06:20 UTC | Add rounding shift right instructions (#5664) Co-authored-by: Ankit Aggarwal <aankit@quicinc.com> | 23 January 2021, 01:06:20 UTC |
7cff481 | Dillon Sharlet | 22 January 2021, 21:18:47 UTC | Fix VSX min/max intrinsics. Fixes #5661. (#5663) | 22 January 2021, 21:18:47 UTC |
6b398a3 | Andrew Adams | 22 January 2021, 18:23:48 UTC | Better codegen for switch-statement-like if-else chains (#5595) Better codegen for switch-statement-like if-else chains And added a test that demonstrates writing a little interpreter in Halide and scheduling it. | 22 January 2021, 18:23:48 UTC |
8c57a1a | Steven Johnson | 21 January 2021, 22:07:20 UTC | Use linker tools on OSX & Linux to limit exports (#4651) (#5659) * Use linker scripts on OSX & Linux to limit exports * Write script to detect appropriate linker flags. Co-authored-by: Alex Reinking <alex.reinking@gmail.com> | 21 January 2021, 22:07:20 UTC |
be7a6a3 | Alexander Root | 21 January 2021, 21:23:43 UTC | is_positive_const and is_negative_const broken for (some) casts (#5615) * let signed_const checkers fail for non-widening integral casts Co-authored-by: Steven Johnson <srj@google.com> | 21 January 2021, 21:23:43 UTC |
0ca0415 | Steven Johnson | 21 January 2021, 01:59:10 UTC | Remove all deprecated methods for Halide 12 (#5656) * Remove all deprecated methods for Halide 12 These were all marked as deprecated in Halide 11 (and probably Halide 10 too); let's go ahead and remove them in Halide 12. * Remove function bodies too | 21 January 2021, 01:59:10 UTC |
a785b53 | Steven Johnson | 20 January 2021, 02:25:51 UTC | Add Lambda.cpp (#5651) Functions/methods that are part of the Halide public API should (generally) not be inline, to ensure the function instantiation is always in libHalide. | 20 January 2021, 02:25:51 UTC |
7713b3a | Steven Johnson | 20 January 2021, 02:25:27 UTC | Add Python & PyBind version checking to PyStubImpl.cpp (#5653) It's built separately from the rest of the Python bindings and could get out of sync separately. | 20 January 2021, 02:25:27 UTC |
57083e4 | Andrew Adams | 19 January 2021, 17:44:27 UTC | Fix cuda warp shuffle issue for narrow types (#5624) * Fix cuda warp shuffle issue for narrow types In the case where no shuffle was necessary, we were upcasting the type to 32-bits needlessly and causing chaos. | 19 January 2021, 17:44:27 UTC |
8a12c43 | Alex Reinking | 18 January 2021, 18:53:20 UTC | Upgrade pybind11 to 2.6.x (#5644) * Use pybind11 2.6.0, which fixes Python-finding bugs. * Update Generator.cpp * Update Generator.cpp * Update PyHalide.cpp * 2.6.0 -> 2.6.1 Co-authored-by: Steven Johnson <srj@google.com> | 18 January 2021, 18:53:20 UTC |
42c5182 | Alex Reinking | 15 January 2021, 19:34:45 UTC | Shrink tile size to fit in Mac Mini GPU memory. (#5647) * Shrink tile size to fit in Mac Mini GPU memory. * Fix comment per Shoaib's correction. | 15 January 2021, 19:34:45 UTC |
bb1ca3c | Steven Johnson | 14 January 2021, 22:57:28 UTC | correctness_vector_math: skip hypot() test for LLVM10 (#5643) | 14 January 2021, 22:57:28 UTC |
722b93e | Steven Johnson | 13 January 2021, 20:27:12 UTC | Add 11.1 as an acceptable LLVM version (#5640) * Add 11.1 as an acceptable LLVM version Apparently 11.1 was released but our Makefile only allows for 11.0. * Update Makefile | 13 January 2021, 20:27:12 UTC |
61ca4d2 | Steven Johnson | 13 January 2021, 20:12:26 UTC | Simplify CodeGen_OpenGLCompute_C (#5636) * Simplify CodeGen_OpenGLCompute_C Combines CodeGen_GLSLBase and CodeGen_OpenGLCompute_C into one class, removing unnecessary stuff from the OpenGL support code. | 13 January 2021, 20:12:26 UTC |
0dfdc0d | Alexander Root | 13 January 2021, 08:11:56 UTC | fix typo on assert in lerp() (#5638) | 13 January 2021, 08:11:56 UTC |
6620563 | Steven Johnson | 13 January 2021, 01:06:52 UTC | Add TARGET_OPENGLCOMPUTE (#5637) Inadvertently removed code for properly enabling/disabling OGLC in #5626, this restores it properly | 13 January 2021, 01:06:52 UTC |
4ed4db8 | Steven Johnson | 12 January 2021, 23:09:16 UTC | Remove CodeGen_OpenGL_Dev (#5635) 5626 removed OpenGL support, but didn't remove this no-longer-needed class; removed it here. Moved CodeGen_GLSLBase into CodeGen_OpenGLCompute_Dev.cpp (which is now the only subclass), but didn't yet attempt to consolidate them into a single class. | 12 January 2021, 23:09:16 UTC |
3defb66 | Dillon Sharlet | 12 January 2021, 23:02:59 UTC | Pattern match intrinsics in a target independent lowering pass (#5531) * Simplify intrinsics of broadcasts to broadcasts of intrinsics. * Add pattern matching of intrinsics lowering pass. * Fix broadcast elementwise simplifications for nested broadcasts. * More target independent pattern matching. * Progress on pattern matching for ARM and Hexagon. * broadcasted -> broadcast. * Broken pattern matching. * Fix broken build. * Match without patterns. * Try to match saturating_add. * Pattern matching working for some intrinsics. * x86 simd_op_check passing. * Most x86 and Hexagon patterns working. * Rename subtract -> sub, multiply -> mul. * Add widening_left_shift. * Remove bad simplification. * Fix some missed pattern issues. * Hexagon patterns mostly working. * Fix pmaddwd patterns * Start on ARM intrinsics. * Use shift intrinsics. * Revert formatting of Hexagon intrinsic table * Revert one extra find and replace. * Add table of instructions for ARM. * Add more patterns for rounding_halving_add. * Remove unused unsigned widening subtracts * Add intrinsics test * Remove bogus patterns. * Match lanes in shifts. * Progress on multiply-add Hexagon pattern matching. * Fix multiply-subtracts * Enable constant folding of broadcasted constants. * Fix some widening patterns * Fix double widening lossless casts * Use widen/narrow helpers * All Hexagon and x86 patterns working * Fix return type of widening subtracts. * Fix rounding shift right patterns * WIP simd op check * Add CodeGen_LLVM::Intrinsic and related helpers. * Use call_elementwise_intrinsic for more patterns. * Clean up intrinsics a bit. * Use call_elementwise_intrinsic for x86. * More clean-up and comments. * Add comment * Use call_elementwise_intrinsic for pmaddwd * Remove stray comment. * Move a few more things to overloaded intrinsics * Remove unused runtime functions. * Fix some corner case target flags * ssse... 
* Run clang-format * Replace introspection test. * Remove x86_avx512 initmod * clang-tidy * Remove x86_avx512 from makefile too * Revert simd_op_check * clang-format off on tables * Fix merge conflicts * Add abs and absd support * Pattern match some absd patterns * Clean up dead logic. * Remove duplicate merge content. * Update Generator.cpp * Update Generator.cpp * Also check one sided saturating add. * Fix requirement for abs_i8x32 * Fix some saturating add/sub patterns. * More ways to express rounding halving add/sub * Add widening_shift_right * Use lower_intrinsic to handle unknown calls. * Use pattern match results. * Fix incorrect patterns * All(?!) ARM shifts working * Remove unused declarations * Don't substitute all lets * Reduce code duplication in tables * Simplify negated shifts. * Handle possible pmaddwd overload resolution failures. * Fix some overflow cases. * Small cleanups. * Review fixes * Fix some broken patterns * Don't hardcode 4 * PatternMatchIntrinsics -> FindIntrinsics * Re-enable uhsub patterns * Add useful simplication to lower_int_uint_mod * Lower unknown intrinsics. * Also check for bitwise_and * Simplify bitwise_and(x, -1). * Add back some necessary simplifications. * Fix incorrect narrowing of widening add/sub. * Fix boneheaded add/sub swap. * Skip finding intrinsics for scalar types. * Add accidentally removed break. * Improve comments. * clang-format, clang-tidy, and other fixes * Try to fix compiler-specific errors, more clang-tidy. * More clang-tidy fixes. * Tweak comments on rounding_shift_left/right. * Move default mulhi_shr and sorted_avg lowering to lower_intrinsics. * Argh clang-format. * Better coverage of shift correctness. * Fix unknown intrinsic visitor. * Add comments to rounding shifts. * Add TODO for C++14 * Use pattern matching for vector reduce. * Simplify pattern matching a little. * Remove stray newline in vectorreduce ops. 
* Add HL_SIMD_OP_CHECK_FILTER env variable * Fix and refactor vector reduce codegen * clang-format * pmulhrsw doesn't exist until sse3! * Fix incorrect lack of fall-through * Don't lower int/uint division in FindIntrinsics. * Fix missing check of op type. * Don't handle Mod at all. * Programmatically generate split argument intrinsic wrappers. * clang-format * More clang-format * Clean up IRMatch helpers. * Small cleanups/review comments. * Fix boneheaded bug. * Fix addp * Fix float addp * Fix incorrect rounding shift saturation patterns. * Update Hexagon vector reductions. * clang-format * Small cleanups. * clang-tidy * Fix sign of shifts on arm32 * Fix LLVM10 workaround * Work around opaque LLVM failures. * Pattern match dp4a/dp2a * clang-format * Don't try to use shift_right_narrow patterns on invalid shifts. * Don't rely on mul visitor to produce shifts. * Fix arm32 * Address some review comments. * Speculatively fix non-locally-reproducing dot product failures * Fix CUDA dot products. * clang-format * Remove debugging code. * clang-tidy * Put needed check back. * Bring back some tests/patterns. * Renable saturating_add pattern. * Bring back a few more tests. * Fix mixed sign swapped ops case. * Better implementation of handling mixed signs. * Fix some issues. * Avoid ADL fights with common user helper functions. * Don't try to find intrinsics for bool operations. * Remove redundant patterns. * Bring back div/mod lowering. * Update PowerPC to use pattern matched intrinsics. Co-authored-by: Steven Johnson <srj@google.com> | 12 January 2021, 23:02:59 UTC |
27f55dd | Steven Johnson | 12 January 2021, 17:55:36 UTC | Remove OpenGL support (part 1) (#5626) * Remove OpenGL support (part 1) Fixes #5475 This removes the OpenGL backend (but *not* the OpenGLCompute backend) from public use: - Remove Target::OpenGL - remove DeviceAPI::GLSL - remove Func::glsl() and Func::shader() - remove all OpenGL-specific apps and tests - remove HalideRuntimeOpenGL.h - remove some internal code that is OpenGL-only Note that there is still internal code that needs trimming; since the OpenGLCompute backend uses some of the same code, and some of the same build deps, and some of the same runtime shared-library loading, I tried to err on the side of leaving code/buildrules/etc in place for now, with the plan to clean that up in subsequent PRs. Note also that feature Target::EGL is still present, as I believe it is still useful in conjunction with OpenGLCompute. | 12 January 2021, 17:55:36 UTC |
9b99acb | Dillon Sharlet | 11 January 2021, 21:25:36 UTC | Clean up includes (#5584) * Remove unused wildcard/type info. * Use std::unique_ptr to avoid sketchy lifetime management. * Clean up includes * Pull some changes from small-cleanups3 * clang-format * Add missing include. * Add missing include. * clang-format. * Fix function type * clang-format * Add missing include Co-authored-by: Steven Johnson <srj@google.com> | 11 January 2021, 21:25:36 UTC |
cc03c9c | Andrew Adams | 09 January 2021, 00:28:07 UTC | Prototype of multiple scattering update definitions (#5553) Add "gather" and "scatter" intrinsics, which let you write update definitions which store multiple values at once to different computed locations. Useful for doing things like swapping or permuting elements in-place. See comments in IROperator.h for more details. | 09 January 2021, 00:28:07 UTC |
9c59d94 | Alex Reinking | 09 January 2021, 00:22:40 UTC | Set version to 12.0.0. Fixes #5259 (#5289) | 09 January 2021, 00:22:40 UTC |
2e5f1e0 | xndcn | 09 January 2021, 00:06:05 UTC | Fix issues in OpenGL backend (#5545) Co-authored-by: Alex Reinking <alex_reinking@berkeley.edu> Co-authored-by: Steven Johnson <srj@google.com> Co-authored-by: Alex Reinking <alex.reinking@gmail.com> | 09 January 2021, 00:06:05 UTC |
392b53e | Steven Johnson | 08 January 2021, 23:21:24 UTC | Check error results from all egl calls (#5619) * Check error results from all egl calls We were ignoring the result from a couple of calls. * Update opengl_egl_context.cpp * Update Generator.cpp * Update Generator.cpp | 08 January 2021, 23:21:24 UTC |
2b3aaa8 | xndcn | 08 January 2021, 19:46:56 UTC | Add max threads checking for Metal (#5588) * Add max threads checking for Metal Originally, this checking will be asserted by Metal API Validation in Xcode, otherwise the program will crash or output wrong results. * Disable the max threads checking for Metal in non-debug runtime * Disable error/metal_threads_too_large test for non-OSX target | 08 January 2021, 19:46:56 UTC |
081f472 | xndcn | 07 January 2021, 22:49:08 UTC | Add CLDoubles feature check for OpenCL double type (#5610) Similar to CLHalf feature check for half type. | 07 January 2021, 22:49:08 UTC |
46fc56a | Steven Johnson | 07 January 2021, 18:29:38 UTC | Don't allow CUDACapability80 on LLVM10 (#5617) LLVM10 can't handle that version of Cuda; we never noticed till now because we didn't have a buildbot with a GPU that could handle it. Modify the sniffers to cap capability at 75 for LLVM10 builds, and fail with user errors if that capability is explicitly requested. | 07 January 2021, 18:29:38 UTC |
f38801e | Dillon Sharlet | 06 January 2021, 17:52:38 UTC | Use std::unique_ptr to manage CodeGen classes (#5583) * Remove unused wildcard/type info. * Use std::unique_ptr to avoid sketchy lifetime management. * Pull some changes from small-cleanups3 * Use auto for some loops. * clang-tidy | 06 January 2021, 17:52:38 UTC |
8063879 | prdelgado | 05 January 2021, 21:50:53 UTC | replaced indentation in line 20 with spaces to show proper error message (#5580) * replaced indentation in line 20 with spaces to show proper error message * added error message detail with alternative solution based on PR feedback | 05 January 2021, 21:50:53 UTC |
8383cc9 | Andrew Adams | 28 December 2020, 23:49:45 UTC | Delete lane extraction code in vectorization (#5596) | 28 December 2020, 23:49:45 UTC |
890a519 | pkubaj | 23 December 2020, 01:15:44 UTC | Fix build on FreeBSD/powerpc64 (#5572) * Fix build on FreeBSD/powerpc64 FreeBSD doesn't use getauxval, but elf_aux_info. * Make the conditional only work Linux and FreeBSD * Make the conditional only for FreeBSD and Linux | 23 December 2020, 01:15:44 UTC |
8a0f4a1 | Dillon Sharlet | 22 December 2020, 01:54:36 UTC | Fix sketchy shadowing that breaks on some compilers. Fixes #5581 (#5587) * Fix sketchy shadowing that breaks on some compilers. Fixes #5581 * Fix another sketchy shadowing. | 22 December 2020, 01:54:36 UTC |
b22598c | Dillon Sharlet | 21 December 2020, 22:56:00 UTC | Remove unused wildcard/type info. (#5582) | 21 December 2020, 22:56:00 UTC |
5ac8808 | Dillon Sharlet | 21 December 2020, 21:54:37 UTC | Fix several bugs on Hexagon and some cleanup (#5570) * Fix several bugs on Hexagon. * clang-format actually found a bug | 21 December 2020, 21:54:37 UTC |
1dbcf19 | Steven Johnson | 21 December 2020, 17:08:20 UTC | Decouple wasm's +bulk-memory from wasm_threads (#5574) * Decouple wasm's +bulk-memory from threads When `wasm_threads` was added, `+bulk-memory` codegen was enabled in conjunction with this feature, due to some inscrutable error which apparently didn't get recorded. From inspection of the spec for bulk-memory, and experimentation with the most recent version of Emscripten (2.0.10), I can't find any reason that this actually needs to be enabled, so I've moved it into its own new feature flag. Also: drive-by fix in Target to format the tables in `get_runtime_compatible_target()` better, and to remove some wasm-related entries from the 'must match' table that didn't actually need to match. * Create .gitignore | 21 December 2020, 17:08:20 UTC |
ef45c87 | Steven Johnson | 19 December 2020, 00:45:34 UTC | Fix for trunk LLVM (#5576) | 19 December 2020, 00:45:34 UTC |
83b040d | Zalman Stern | 17 December 2020, 21:40:16 UTC | Add a feature to name cached memoizations and to evict them by name. (#5510) This PR adds an optional ```EvictionKey``` parameter to the ```memoize``` scheduling option. EvictionKeys are user provided labels of up to 64-bits that can be used to request that labeled items in the cache be removed to free up space. Co-authored-by: Steven Johnson <srj@google.com> | 17 December 2020, 21:40:16 UTC |
590b253 | Marcos Slomp | 17 December 2020, 19:22:09 UTC | D3D12: refactoring of kernel argument constant buffer packing (#5569) * initial setup for Direct3D 12 support for Windows-on-ARM * fixing runtime modules for Windows on ARM 64 * typo * wrapping windows_clock for Windows on ARM support * temporarily disabling windows_clock_[x86/arm] * wip * Set -fshort-wchar on generic Windows runtime target. * removing windows_clock specializations * replacing accidental tabs * addressing code review comments * Hoist fpic in CMakeFile. Mirror CMake changes to Makefile. * Add explanatory comment to Makefile * Add arm64-windows and -windows-d3d12compute targets to correctness_cross_compilation * fixed kernel parameter packing * Run clang-format * handling Bool -- UInt(1), and Int(1) as well just in case * Add previously-failing D3D12 test * Add new test to CMake Co-authored-by: Marcos Slomp <slomp@adobe.com> Co-authored-by: Shoaib Kamil <kamil@adobe.com> Co-authored-by: Shoaib Kamil <shoaibkamil@gmail.com> Co-authored-by: Steven Johnson <srj@google.com> | 17 December 2020, 19:22:09 UTC |
7fd3a7b | Marcos Slomp | 16 December 2020, 23:23:50 UTC | Windows on ARM64 support (CPU, and also GPU through D3D12) (#5544) * initial setup for Direct3D 12 support for Windows-on-ARM * fixing runtime modules for Windows on ARM 64 * typo * wrapping windows_clock for Windows on ARM support * temporarily disabling windows_clock_[x86/arm] * wip * Set -fshort-wchar on generic Windows runtime target. * removing windows_clock specializations * replacing accidental tabs * addressing code review comments * Hoist fpic in CMakeFile. Mirror CMake changes to Makefile. * Add explanatory comment to Makefile * Add arm64-windows and -windows-d3d12compute targets to correctness_cross_compilation Co-authored-by: Marcos Slomp <slomp@adobe.com> Co-authored-by: Shoaib Kamil <kamil@adobe.com> Co-authored-by: Shoaib Kamil <shoaibkamil@gmail.com> | 16 December 2020, 23:23:50 UTC |
10a01dd | Dillon Sharlet | 16 December 2020, 23:08:32 UTC | Move CodeGen_Hexagon to internal linkage and don't include it without WITH_HEXAGON (#5567) * Move CodeGen_Hexagon to internal linkage. * Move using out of the #ifdef * clang-tidy | 16 December 2020, 23:08:32 UTC |
0cfa6db | Steven Johnson | 16 December 2020, 18:19:52 UTC | Fix minor wasm issues (#5566) * Fix minor wasm issues - If wasm_threads is in the target string, be sure to launch the external shell with --experimental-wasm-threads. - All the test/performance tests should detect wasm and explicitly skip * Disable noisy warning in WABT | 16 December 2020, 18:19:52 UTC |
34d35a3 | Dillon Sharlet | 16 December 2020, 18:18:36 UTC | Add special case for printing broadcast shuffles. (#5565) | 16 December 2020, 18:18:36 UTC |
94da4f6 | cimes-isi | 15 December 2020, 22:18:36 UTC | autoscheduler: prepend, don't override LD_LIBRARY_PATH in adams2019 test (#5563) | 15 December 2020, 22:18:36 UTC |
fef82c2 | cimes-isi | 15 December 2020, 17:57:59 UTC | cmake: detect ppc64le arch (#5558) | 15 December 2020, 17:57:59 UTC |
a0bbf43 | Steven Johnson | 15 December 2020, 17:57:33 UTC | Fix fragile Makefile for apps/onnx (#5540) The protoc usage happened to work when building in the app folder but often failed when building from the toplevel Makefile. Also, drive-by silencing of noise from curl, and drive-by fix of "redundant copy" warning in onnx_converter.cc. | 15 December 2020, 17:57:33 UTC |
ee2e5df | Steven Johnson | 15 December 2020, 17:57:00 UTC | Upgrade WABT version to 1.0.20 (#5557) | 15 December 2020, 17:57:00 UTC |
d37b995 | Alex Reinking | 14 December 2020, 23:04:36 UTC | Better document how LLVM_DIR works. (#5560) | 14 December 2020, 23:04:36 UTC |
9fea59f | aankit-ca | 14 December 2020, 17:01:27 UTC | Bug fix for lossless_cast with minor additions (#5459) * Bug fix for lossless_cast with minor additions The bug can be seen for types where the lossless_cast type can represent cast->value.type() but not cast->type. For example: lossless_cast(UInt(16), cast(Int(8), Variable::make(UInt(16), e))) returns (uint16)e which is incorrect. The patch also adds lossless_cast of Mod and Ramp expressions. * Handle Mod for negative numbers in lossless_cast. * Add lossless_cast test for VectorReduce. * Rename check to check_lossless_cast. * clang-format complains * Remove Ramp and Mod from lossless_cast. * Minor changes * Update test/correctness/CMakeLists.txt Co-authored-by: Ankit Aggarwal <aankit@quicinc.com> | 14 December 2020, 17:01:27 UTC |
ed8f7c2 | Dillon Sharlet | 14 December 2020, 04:46:38 UTC | Hide inaccessible symbols in internal linkage (#5548) * Hide inaccessible symbols in internal linkage. * clang-format * Remove redundant static. | 14 December 2020, 04:46:38 UTC |
f9153e8 | Steven Johnson | 13 December 2020, 22:19:38 UTC | Mark Target::OpenGL (etc) as deprecated (#5475) (#5551) | 13 December 2020, 22:19:38 UTC |
4fa78b6 | Alexander Root | 13 December 2020, 18:34:17 UTC | change StrongestExprNodeType for rewriter (#5554) | 13 December 2020, 18:34:17 UTC |
a0ddabe | Volodymyr Kysenko | 12 December 2020, 00:16:35 UTC | Remove <iostream> from the code generated by CodeGen_C (#5547) | 12 December 2020, 00:16:35 UTC |
968f6b3 | aankit-ca | 10 December 2020, 21:32:56 UTC | VectorReduce peephole matching for Hexagon (#5424) * CodeGen for VectorReduce for Hexagon * Remove use of MAKE_ID_PAIR. * Fix clang-format errors. * Spelling correction. * Address comments from PR. Use Shuffle::make_concat instead of vcombine. * Remove IROperator changes. * Address comments * Move even-odd shuffling for vrmpy to runtime .ll func * clang-format + hvx_128 changes.ll changes * clang-format * Minor changes * Minor changes * interchange vshuffvdd operand Co-authored-by: Ankit Aggarwal <aankit@quicinc.com> Co-authored-by: Steven Johnson <srj@google.com> | 10 December 2020, 21:32:56 UTC |
ad414e2 | xndcn | 10 December 2020, 17:10:50 UTC | Add possible simplify to GL(Compute) `pow` function (#5517) OpenGL(Compute) generates `select` IR for `pow(a, b)` function, which can be simplified when `a` or `b` is const. | 10 December 2020, 17:10:50 UTC |
5e526d4 | Zalman Stern | 10 December 2020, 09:40:20 UTC | Modify memoization code to allow using min/extent/stride of an input. (#5542) Modify memoization code to allow using min/extent/stride of a buffer as part of memoization without wrapping in memoize_tag. This seems reasonable and failure to support this causes tricky to diagnose errors if one uses the extent of an input in an RDom that is then used in a memoized Func. The code pattern here is a bit heuristic in that I can't think of a case where a Var has a buffer but the reference isn't to a *field* of the buffer. If this turns out to be incorrect or to become invalid in the future, the code could be extended to pattern match the variable name. | 10 December 2020, 09:40:20 UTC |
382c807 | Steven Johnson | 09 December 2020, 12:06:57 UTC | Update images used in apps/ tests (#5538) Some of them weren't the same as the Make equivalents, which meant that the test diverged between the two build systems (sometimes causing failures due to too-large images). | 09 December 2020, 12:06:57 UTC |
b83de89 | Zalman Stern | 09 December 2020, 04:25:41 UTC | Pathnames may or may not be absolute so loosen comparison to allow for this. (#5535) | 09 December 2020, 04:25:41 UTC |
873c8f1 | Zalman Stern | 08 December 2020, 21:39:04 UTC | Solve the COMDAT in runtime failing on Mac OS X problem once and for all. (#5532) Solve the COMDAT in runtime failing on Mac OS X problem once and for all by removing Comdat IR annotations in runtime on Mac OS and iOS. | 08 December 2020, 21:39:04 UTC |
42b1a6e | Dillon Sharlet | 08 December 2020, 01:11:28 UTC | Add overloaded intrinsic mechanism to simplify code generation (#5527) * Add table of instructions for ARM. * Add CodeGen_LLVM::Intrinsic and related helpers. * Use call_elementwise_intrinsic for more patterns. * Clean up intrinsics a bit. * Use call_elementwise_intrinsic for x86. * More clean-up and comments. * Add comment * Use call_elementwise_intrinsic for pmaddwd * Remove stray comment. * Move a few more things to overloaded intrinsics * Remove unused runtime functions. * Fix some corner case target flags * ssse... * Run clang-format * Replace introspection test. * Remove x86_avx512 initmod * clang-tidy * Remove x86_avx512 from makefile too * Revert simd_op_check * clang-format off on tables * Update Generator.cpp * Update Generator.cpp * Fix requirement for abs_i8x32 * Review fixes * Temporarily work around webassembly strangeness. Co-authored-by: Steven Johnson <srj@google.com> | 08 December 2020, 01:11:28 UTC |
7f70907 | Volodymyr Kysenko | 07 December 2020, 17:17:14 UTC | Combine align and slice for the small vectors in align_loads (#5497) * Combine align and slice for the small vectors in align_loads * Fix format | 07 December 2020, 17:17:14 UTC |
1800dc2 | Volodymyr Kysenko | 07 December 2020, 01:45:39 UTC | Simplify a slice of slice (#5495) * Simplify a slice of slice * Fix format * Simplify for slice of concats + tests * format * format * New line to improve readability Co-authored-by: Steven Johnson <srj@google.com> | 07 December 2020, 01:45:39 UTC |
bd53b47 | Volodymyr Kysenko | 06 December 2020, 20:39:26 UTC | Allow creation of IntImm/UIntImm with any number of bits up to 64 (#5441) * Allow creation of IntImm/UIntImm with any number of bits up to 64 * Changes: - check that the number of bits is >= 1 - modify upgrade_* functions - allow printing of type with arbitrary number of bits. * Fix format * next_power_of_two which will end Co-authored-by: Steven Johnson <srj@google.com> | 06 December 2020, 20:39:26 UTC |
7ea09cd | Alex Reinking | 06 December 2020, 07:38:56 UTC | Point fft JIT tests to Halide binary (#5521) | 06 December 2020, 07:38:56 UTC |
d325e13 | Dillon Sharlet | 04 December 2020, 16:38:05 UTC | Add simd_op_check tests and a few more patterns (#5519) * Add simd_op_check coverage of some ARM ops we generate. * Remove local filter option. * Fix expected patterns for arm32. | 04 December 2020, 16:38:05 UTC |
c1885fc | Alexander Root | 04 December 2020, 00:14:07 UTC | Fixes to bounds inference on shift_left (#5477) * Add shift_left fix for signed integers by possibly negative values + regression test * add required condition on shift_left integer fix * add type check to shift_left minimum condition * fix constant folding of shifts with |b| >= type.bits() for types that allow overflow (fails correctness/simplify test) * make regression tests use scoped bindings * change condition in case int24/int48 proposal happens soon * revert changes based on overflow expectations * add more regression tests * clarify comment * add shift_left min handler for b only UB * fix clang-tidy complaint * relax shift_left of non-negative value constraint * pull case outside of unnecessary preconditions * fix clang-format complaint * fix broken precondition * add typecheck to possibly save a can_prove() call * add easy-out type check to precondition * Add descriptive comment to bug fix + add another early-exit precondition Co-authored-by: Steven Johnson <srj@google.com> | 04 December 2020, 00:14:07 UTC |
28f9aef | Alex Reinking | 03 December 2020, 22:05:21 UTC | Enable commented clang-format option. (#5520) | 03 December 2020, 22:05:21 UTC |
759b241 | Steven Johnson | 03 December 2020, 18:04:00 UTC | Add version-checking to the clang-tidy and clang-format scripts (#5513) Using the 'wrong' version of the tools will produce results out of sync with our presubmit tests, so add checking to ensure the user has their env set up correctly. | 03 December 2020, 18:04:00 UTC |
2ddd0b0 | Steven Johnson | 03 December 2020, 02:10:58 UTC | Revert "Make context handling in GPU runtimes more consistent and robust. (#5474)" (#5515) This reverts commit f47c5c99deac86c6d1f16cfcb1743a0e9e79317d. | 03 December 2020, 02:10:58 UTC |
2c8e3ea | Steven Johnson | 03 December 2020, 02:08:31 UTC | Revert "Fix broken destroy_context() in gpu_multi_context_threaded_aottest.cpp (#5512)" (#5514) This reverts commit 445ed5ee5ba5e23efaabe0b8d6971c0678b5a569. | 03 December 2020, 02:08:31 UTC |
445ed5e | Steven Johnson | 03 December 2020, 00:35:48 UTC | Fix broken destroy_context() in gpu_multi_context_threaded_aottest.cpp (#5512) | 03 December 2020, 00:35:48 UTC |
a34d00d | Alex Reinking | 02 December 2020, 22:44:43 UTC | Adding CMake build for FFT (#5508) * Add fft build * Fix properties * Fix generator argument * Add "Success!" message to fft aot test. * Formatting. * Fix target directory for bench_fft | 02 December 2020, 22:44:43 UTC |
f47c5c9 | Zalman Stern | 02 December 2020, 22:40:21 UTC | Make context handling in GPU runtimes more consistent and robust. (#5474) This PR adds a consistent GPU compiled kernel cache across the Cuda, Direct3D, OpenCL, and Metal runtimes. This cache is robust for kernels being used across multiple contexts and threads as well as using common code via a template. OpenGL and OpenGLCompute are not addressed due to issues in their implementation. There should be no regressions for those runtimes however. Adds tests for many GPU kernels and kernels across contexts and threads. Fixes a bug in CUDA runtime where some error message text in cuda_do_multidimensional_copy was not initialized. Fixes a bug in CUDA runtime where device release code did not run if CUDA libraries are directly linked into the executable. (This would have caused crashes due to the device allocation caching among other issues.) | 02 December 2020, 22:40:21 UTC |
073b8e4 | Alex Reinking | 02 December 2020, 22:19:34 UTC | Add CMake presets for 3.19+ users (#5506) * add CMakePresets.json and update docs * fix Windows presets * remove NDEBUG from GCC options * fix typo in README | 02 December 2020, 22:19:34 UTC |
1c0f824 | Alex Reinking | 02 December 2020, 22:15:23 UTC | Restructure apps to be fully external. (#5507) * Restructure apps to be fully external. * drive-by fix default Halide_TARGET * patch up fused apps build * remove doubled line * fixing multiple import for 3.16 * fix naming convention * Add missing #include <cstdio> | 02 December 2020, 22:15:23 UTC |
329a405 | Dillon Sharlet | 02 December 2020, 18:29:08 UTC | Enable constant folding of broadcasted constants (#5500) * Enable constant folding of broadcasted constants. * Make some scalar constant folding tests vectors. * Remove excessive simplify calls causing infinite recursion. Co-authored-by: Steven Johnson <srj@google.com> | 02 December 2020, 18:29:08 UTC |
6cc24bb | Andrew Adams | 01 December 2020, 20:49:09 UTC | Fix compile time regression in fft (#5494) * Use equal instead of can_prove equality when examining enclosing scope There can be a lot of things in there, and can_prove is expensive. * Speed up bounds_of_inner_var By only expanding enclosing let stmts if the variable is actually used in the result, and by finding the last usage and then skipping anything earlier (skipping over nested producer nodes) Co-authored-by: Steven Johnson <srj@google.com> | 01 December 2020, 20:49:09 UTC |
6af4361 | Steven Johnson | 01 December 2020, 16:58:13 UTC | Fixes for trunk LLVM (#5499) | 01 December 2020, 16:58:13 UTC |
44c9a72 | Dillon Sharlet | 01 December 2020, 04:32:46 UTC | Reduce size of test image (#5496) | 01 December 2020, 04:32:46 UTC |
1ad6fb8 | Dillon Sharlet | 01 December 2020, 04:31:39 UTC | Fix case where simplifying interleaves might need a slice of the original vector (#5492) * Replace is_negative_negatable_const and associated cruft with lossless_negate. * Don't assume an interleave consumes all of the vectors it is shuffled from. * Add test of slices of interleaves. * Fix formatting * Rephrase logic. | 01 December 2020, 04:31:39 UTC |
491791d | Dillon Sharlet | 01 December 2020, 04:31:00 UTC | Simplify signed shifts more strongly (#5491) * Simplify signed shifts more strongly. * Simplify after negating b. * Also mutate other possibly simplifying cast. | 01 December 2020, 04:31:00 UTC |
960f857 | Volodymyr Kysenko | 30 November 2020, 22:58:10 UTC | Fix All value from the ValType table (#5493) | 30 November 2020, 22:58:10 UTC |
21afdc4 | Andrew Adams | 30 November 2020, 21:14:56 UTC | Align the base when doing strided loads from constant addresses (#5489) When we codegen something like f[ramp(x + 1, 2, 16)], where f is an internal allocation, we subtract the 1, do the dense load f[ramp(x, 1, 32)] and then take the odd lanes of the result. The reason for this is that it's likely that there's an f[ramp(x, 2, 16)] nearby, and aligning down the x+1 to x means we can share the dense loads and just deinterleave. This PR does the same when there's no x, just an odd constant. This means that cases like f[ramp(64, 2, 16)] + f[ramp(65, 2, 16)] now generate much better assembly. In one case I have it speeds up an entire pipeline by 8%, because aligning the loads in this way causes them to all be promoted off the stack into registers. | 30 November 2020, 21:14:56 UTC |
226b12c | Steven Johnson | 30 November 2020, 19:12:07 UTC | Improve speed of testing apps/ (#5482) * Improve speed of testing apps/ - Skip all app tests that are labeled as 'benchmarks' - Specify `--build-noclean` to avoid unnecessary full rebuilds * Change label 'benchmark' -> 'slow_tests' | 30 November 2020, 19:12:07 UTC |
16929df | Dillon Sharlet | 30 November 2020, 18:27:56 UTC | Add Type::widen and Type::narrow helpers. (#5478) * Add Type::widen and Type::narrow helpers. * widen -> wide, more uses of wide. * wide back to widen. Co-authored-by: Dillon Sharlet <dsharlet@gmail.com> | 30 November 2020, 18:27:56 UTC |
78489d0 | Dillon Sharlet | 30 November 2020, 16:15:16 UTC | Small cleanups/fixes (#5479) * Small cleanups/fixes peeled from lower-patterns2. * Fix derp * Fix possibly undefined evaluation order. * Smaller code. * Work around test issue. | 30 November 2020, 16:15:16 UTC |
49ca720 | Dillon Sharlet | 30 November 2020, 15:43:18 UTC | Replace is_negative_negatable_const and more logic with lossless_negate (#5490) * Replace is_negative_negatable_const and associated cruft with lossless_negate. * Add comment | 30 November 2020, 15:43:18 UTC |
bfbfacd | Dillon Sharlet | 27 November 2020, 20:31:02 UTC | Revert formatting of Hexagon intrinsic table (#5484) * Revert formatting of Hexagon intrinsic table * Revert one extra find and replace. | 27 November 2020, 20:31:02 UTC |
f911a89 | Dillon Sharlet | 26 November 2020, 07:40:25 UTC | Add as_intrinsic helper (#5480) * Add as_intrinsic helper. * Rename calls of known intrinsics. * Fix check_sio. | 26 November 2020, 07:40:25 UTC |
59bbc4d | Dillon Sharlet | 25 November 2020, 19:36:02 UTC | Simplify intrinsics of broadcasts to broadcasts of intrinsics (#5473) * Simplify intrinsics of broadcasts to broadcasts of intrinsics. * Fix broadcast elementwise simplifications for nested broadcasts. * broadcasted -> broadcast. | 25 November 2020, 19:36:02 UTC |
3cb2adb | Steven Johnson | 24 November 2020, 21:52:00 UTC | Improvements to HalideTraceViz (#5466) - Handle 4D inputs more gracefully - Improve horizontal squishing of long labels | 24 November 2020, 21:52:00 UTC |
87c9fac | Alex Reinking | 24 November 2020, 00:32:49 UTC | Fail CMake when LLVM_LINK_LLVM_DYLIB conflicts with wasm (#5472) * Fail CMake when LLVM_LINK_LLVM_DYLIB conflicts with wasm * Update error message and add comment. | 24 November 2020, 00:32:49 UTC |
31e9687 | Alexander Root | 23 November 2020, 23:01:04 UTC | Remove AndConditionOverDomain and fix Interval::everything() uses in Bounds (#5455) * rm AndConditionOverDomain and fix Interval::everything() uses in Bounds * fix clang-tidy complaint * rm unnecessary/irrelevant comment * nit: add line break | 23 November 2020, 23:01:04 UTC |