Revision - d76970a - Fix apps/HelloPyTorch - origin: https://github.com/halide/Halide

https://github.com/halide/Halide

19 April 2024, 08:20:39 UTC

Revision d76970aa081df7d30b43a22295b02be759aae93c authored by Steven Johnson on 09 February 2021, 22:32:19 UTC, committed by Steven Johnson on 09 February 2021, 22:32:19 UTC

Fix apps/HelloPyTorch

1 parent fe0888b

Permalinks (SWHIDs)

revision: swh:1:rev:d76970aa081df7d30b43a22295b02be759aae93c
content: swh:1:cnt:d615f775bdbaef2ba4f343ffe0225d35b2dee444
snapshot: swh:1:snp:70f530b74f5be73cfb71c212c9e3317ce44c1ebc

FuseGPUThreadLoops.h

#ifndef HALIDE_FUSE_GPU_THREAD_LOOPS_H
#define HALIDE_FUSE_GPU_THREAD_LOOPS_H

/** \file
 * Defines the lowering pass that fuses and normalizes loops over gpu
 * threads to target CUDA, OpenCL, and Metal.
 */

#include "Expr.h"

namespace Halide {
namespace Internal {

/** Rewrite all GPU loops to have a min of zero. */
Stmt zero_gpu_loop_mins(const Stmt &s);

/** Converts Halide's GPGPU IR to the OpenCL/CUDA/Metal model. Within
 * every loop over gpu block indices, fuse the inner loops over thread
 * indices into a single loop (with predication to turn off
 * threads). Push if conditions between GPU blocks to the innermost GPU threads.
 * Also injects synchronization points as needed, and hoists
 * shared allocations at the block level out into a single shared
 * memory array, and heap allocations into a slice of a global pool
 * allocated outside the kernel. */
Stmt fuse_gpu_thread_loops(Stmt s);

}  // namespace Internal
}  // namespace Halide

#endif
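
The header comments describe two transformations: shifting GPU loops to start at zero, and fusing the thread loops inside a block into a single predicated loop. The sketch below models that idea in plain C++ on the CPU. It does not use Halide's IR or call `zero_gpu_loop_mins` or `fuse_gpu_thread_loops`; the `Result` struct, the constants, and both functions are hypothetical names chosen for illustration, and the two loop bodies are deliberately independent, so no synchronization point is modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical container for the values written by the two loop bodies.
// This is an illustration only, not a Halide type.
struct Result {
    std::vector<int> a, b;
    bool operator==(const Result &o) const { return a == o.a && b == o.b; }
};

constexpr int kBlocks = 2;
constexpr int kMinA = 2, kExtentA = 6;  // first thread loop covers [2, 8)
constexpr int kMinB = 0, kExtentB = 4;  // second thread loop covers [0, 4)

// Before: each block runs two separate thread loops, one with a nonzero min.
Result run_unfused() {
    Result r{std::vector<int>(kBlocks * kExtentA),
             std::vector<int>(kBlocks * kExtentB)};
    for (int block = 0; block < kBlocks; block++) {
        for (int x = kMinA; x < kMinA + kExtentA; x++) {
            r.a[block * kExtentA + (x - kMinA)] = 10 * block + x;
        }
        for (int y = kMinB; y < kMinB + kExtentB; y++) {
            r.b[block * kExtentB + (y - kMinB)] = 100 * block + y;
        }
    }
    return r;
}

// After: one thread loop per block, starting at zero and spanning the
// largest extent. Each original body is guarded by a predicate so that
// threads beyond its extent do nothing.
Result run_fused() {
    Result r{std::vector<int>(kBlocks * kExtentA),
             std::vector<int>(kBlocks * kExtentB)};
    const int threads = std::max(kExtentA, kExtentB);
    for (int block = 0; block < kBlocks; block++) {
        for (int t = 0; t < threads; t++) {
            if (t < kExtentA) {
                const int x = t + kMinA;  // recover the original index
                r.a[block * kExtentA + t] = 10 * block + x;
            }
            // If the second body read values produced by the first, a real
            // kernel would need a barrier here; this sketch has no such
            // dependence.
            if (t < kExtentB) {
                const int y = t + kMinB;
                r.b[block * kExtentB + t] = 100 * block + y;
            }
        }
    }
    return r;
}
```

Calling `run_unfused()` and `run_fused()` and comparing the results with `==` should show that both loop structures perform the same work, which is the property the fused form is meant to preserve.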
