swh:1:snp:70f530b74f5be73cfb71c212c9e3317ce44c1ebc

Revision Author Date Message Commit Date
6b32fa2 Merge branch 'abadams/1v3_linear_comparison_cancellations' into abadams/diagnose_boundary_condition_failure 24 June 2020, 01:42:10 UTC
f95386b Test three hypotheses 1) llvm loop opts are messing things up 2) The auto-benchmarking code is running amok 3) We're rejitting every iteration 24 June 2020, 00:51:13 UTC
b44d1de Add debugging spew to help figure out why test is failing on buildbots 24 June 2020, 00:51:13 UTC
e3107d7 Add some missing 1 vs 3 linear comparison cancellations Somehow we were missing these. They're useful in canceling non-linear terms from both sides of a comparison. Pretty trivial, but formally verified anyway to protect us from typos. 23 June 2020, 20:59:36 UTC
69e320e Merge pull request #5061 from halide/shoaibkamil/metal_is_nan Add is_nan_f32 for metal 23 June 2020, 18:49:04 UTC
c0870ff Merge pull request #5065 from halide/abadams/better_error_message_when_no_distrib better error message if you try to build an app before libHalide 23 June 2020, 18:19:37 UTC
d716ea1 Merge remote-tracking branch 'origin/master' into shoaibkamil/metal_is_nan 23 June 2020, 18:08:04 UTC
3d45335 Remove debugging print 23 June 2020, 18:07:11 UTC
5ebe589 Merge pull request #5041 from halide/abadams/trim_no_ops_lift_loop_invariant_if_statements Add an explicit pass to lift loop invariant if statements 23 June 2020, 18:06:24 UTC
1bda178 Try to trigger buildbots 23 June 2020, 17:45:31 UTC
31546bd Add is_inf_f32/is_nan_f32/is_finite_f32 for D3D12Compute 23 June 2020, 17:24:08 UTC
7a156c4 clang-format 23 June 2020, 14:00:10 UTC
be6ed6e Add GPU version of test 23 June 2020, 13:57:33 UTC
2c83da3 Give a better error message if you try to build an app before building Halide Fixes #5060 22 June 2020, 21:09:46 UTC
4bb7897 Add is_inf and is_finite as well 22 June 2020, 19:08:45 UTC
2748848 Add is_nan_f32 for metal. 22 June 2020, 18:52:56 UTC
8521896 Merge pull request #5049 from halide/abadams/fix_rval_reference_typo Fix #5046 21 June 2020, 03:32:16 UTC
17eb851 Merge branch 'master' into abadams/trim_no_ops_lift_loop_invariant_if_statements 20 June 2020, 22:43:01 UTC
a59107f Merge branch 'abadams/trim_no_ops_lift_loop_invariant_if_statements' of https://github.com/halide/Halide into abadams/trim_no_ops_lift_loop_invariant_if_statements 20 June 2020, 22:42:52 UTC
9279fa5 Merge branch 'master' into abadams/fix_rval_reference_typo 20 June 2020, 22:42:28 UTC
24d7e97 Merge pull request #5057 from halide/srj-sig Minor JITExtern (& related) cleanups 20 June 2020, 22:41:07 UTC
23bc0dc Fixes 19 June 2020, 23:47:01 UTC
d70c6db Merge pull request #5036 from halide/abadams/store_in_register_with_no_lanes_loop Constant extents inferred pre-storage flattening 19 June 2020, 23:42:30 UTC
a98e04e Constant extents need to be inferred pre storage flattening Consider an allocation that has a dynamic extent, but needs to have a constant extent because it's stored in MemoryType::Register (e.g. see the test). We take an upper bound in these cases to get a constant allocation size. This PR changes things to take that upper bound *before* storage flattening instead of after. This way the individual per-dimension extents are all constant, instead of just their product. If you do it after storage flattening then you get dynamic strides within a constant-sized allocation, which is silly and not compatible with hoisting values into registers anyway (because access is at non-constant coords). Also fixed the assumption that MemoryType::Register on the GPU means that there must be a GPULanes loop. 19 June 2020, 23:42:06 UTC
47d8c30 Minor JITExtern (& related) cleanups - make all single-arg ctors explicit, and add one missing explicit usage - add an operator<< to ExternSignature to make debugging related issues easier 19 June 2020, 23:28:19 UTC
a31c39e Add an explicit pass to lift loop invariant if statements If statements can be injected by GuardWithIf, RDom predicates, specializations, and uses of undef. There are various situations where an if statement can end up further inside a loop nest than strictly necessary. This PR adds a pass to hoist them. This results in slightly better codegen for some conv layer schedules on GPU. Also reduced the expr count in lots_of_loop_invariants because it spends a long time inside LLVM 19 June 2020, 23:17:08 UTC
57e0b94 Touch 19 June 2020, 17:17:09 UTC
5f91893 Touch 19 June 2020, 17:16:33 UTC
54302b6 Merge branch 'master' into abadams/fix_rval_reference_typo 19 June 2020, 02:09:53 UTC
b668cd8 Merge branch 'master' into abadams/trim_no_ops_lift_loop_invariant_if_statements 19 June 2020, 02:08:46 UTC
ed40000 Merge branch 'abadams/trim_no_ops_lift_loop_invariant_if_statements' of https://github.com/halide/Halide into abadams/trim_no_ops_lift_loop_invariant_if_statements 19 June 2020, 02:08:44 UTC
c53c7e8 Merge pull request #5054 from halide/srj-cublas Skip cublas on Windows (Issue #5053) 19 June 2020, 02:07:35 UTC
07834a5 Skip cublas on Windows (Issue #5053) 18 June 2020, 22:25:34 UTC
5534e3f Add an explicit pass to lift loop invariant if statements If statements can be injected by GuardWithIf, RDom predicates, specializations, and uses of undef. There are various situations where an if statement can end up further inside a loop nest than strictly necessary. This PR adds a pass to hoist them. This results in slightly better codegen for some conv layer schedules on GPU. Also reduced the expr count in lots_of_loop_invariants because it spends a long time inside LLVM 17 June 2020, 17:49:05 UTC
a308308 Fix #5046 17 June 2020, 17:32:20 UTC
4fc3606 Merge branch 'master' into abadams/trim_no_ops_lift_loop_invariant_if_statements 17 June 2020, 16:05:12 UTC
d7c99db Merge pull request #5045 from halide/srj-llvmfixer Fix for trunk LLVM API changes 17 June 2020, 16:04:41 UTC
02552dd Merge pull request #5042 from halide/shoaibkamil/arm64_windows Add preliminary AOT Windows ARM64 support 17 June 2020, 14:24:09 UTC
23fa7d0 Update Makefile 17 June 2020, 05:13:34 UTC
705d6e4 Fix for trunk LLVM API changes 17 June 2020, 00:37:13 UTC
347608d Add an explicit pass to lift loop invariant if statements If statements can be injected by GuardWithIf, RDom predicates, specializations, and uses of undef. There are various situations where an if statement can end up further inside a loop nest than strictly necessary. This PR adds a pass to hoist them. This results in slightly better codegen for some conv layer schedules on GPU. Also reduced the expr count in lots_of_loop_invariants because it spends a long time inside LLVM 16 June 2020, 20:42:36 UTC
61d0060 clang-format 16 June 2020, 20:31:31 UTC
b761bfe Add issue 16 June 2020, 20:30:18 UTC
340246a Merge remote-tracking branch 'origin/master' into shoaibkamil/arm64_windows 16 June 2020, 18:29:03 UTC
c7098f8 Merge pull request #5035 from halide/abadams/improve_cuda_mat_mul It's worth cancelling correlated subexpressions in load/store indices 16 June 2020, 16:25:50 UTC
19ef844 Merge pull request #5037 from halide/abadams/openglcompute_loop_invariants Put buffers before other uniforms in gl uniform list 16 June 2020, 16:25:32 UTC
b9fa8bf Merge pull request #5038 from halide/srj-copyto Clarify debug logging in copy_to_device() 16 June 2020, 16:25:03 UTC
f147d7b Improve comment on simplify correlated differences 16 June 2020, 00:20:45 UTC
75aa213 It's worth cancelling correlated subexpressions in load/store indices In particular, this makes warp shuffles much more reliable, because any dependence of a load or store index on the block id is more likely to get cancelled out. This PR massively simplifies the generated code for cuda_mat_mul, and makes it about 30% faster (although it's still mysteriously 2x slower than cublas on my card). Also reduces the amount of IR in some other apps very slightly. Doesn't seem to affect compile times. 16 June 2020, 00:20:45 UTC
70b3b75 Clarify debug logging in copy_to_device() We currently always call copy_to_device() on buffers that need to be on device (with the understanding that it's a no-op if no copy is needed); if the `debug` feature is on, a naive reading might make someone think that needless copy-to-device operations are actually happening. This adds a bit of logging (debug mode only) to make it clearer whether the copy to device actually happened, or if it was skipped because host was not dirty. 15 June 2020, 23:11:26 UTC
a6634b6 Add extra comment about why buffers come first 15 June 2020, 22:35:50 UTC
42f66da Put buffers before other uniforms in gl uniform list buffer ids are constrained to be smaller than arbitrary scalar uniforms, so they should go first in the closure. Also added a stress-test for lifting out lots of loop invariants, and disabled LICM completely for GLSL, because it uses magic names (.varying) for some things. 15 June 2020, 22:30:37 UTC
45e35d1 Not a function call 15 June 2020, 17:04:34 UTC
638ac11 Merge pull request #5033 from halide/srj-tsan-fix Fix broken TSAN code 15 June 2020, 16:49:06 UTC
edda3c2 Merge remote-tracking branch 'origin/master' into shoaibkamil/arm64_windows 15 June 2020, 16:26:59 UTC
64467ba Merge branch 'master' into shoaibkamil/arm64_windows 15 June 2020, 16:26:49 UTC
d7d1dac Merge pull request #5025 from halide/shoaibkamil/correct_memory_fences Make gpu_thread_barrier() semantics consistent 15 June 2020, 16:26:30 UTC
2011720 Merge pull request #5032 from halide/abadams/atomic_vectorization_tweaks atomic vectorization tweaks 13 June 2020, 02:00:52 UTC
cacda0e Merge remote-tracking branch 'origin/abadams/fix_cuda_mat_mul_assert' into shoaibkamil/correct_memory_fences 12 June 2020, 21:05:45 UTC
55dac45 Fix inverted assert 12 June 2020, 20:20:32 UTC
79c3873 Fix broken TSAN code Update for a recent LLVM change was incorrect; it compiled but didn't actually work properly. (We should probably run sanitizers on the buildbots...) 12 June 2020, 19:25:58 UTC
1328084 Merge remote-tracking branch 'origin/master' into shoaibkamil/correct_memory_fences 12 June 2020, 19:04:50 UTC
bbe4acf Merge pull request #5030 from halide/abadams/licm_on_innermost_loop_bodies_too Don't lift constant integer offsets 12 June 2020, 17:50:41 UTC
0bc3070 Merge remote-tracking branch 'origin/master' into abadams/atomic_vectorization_tweaks 12 June 2020, 17:49:01 UTC
d9795ee Merge pull request #5031 from halide/srj-comdat Fix some "MachO doesn't support COMDAT" issues in runtime 12 June 2020, 17:46:59 UTC
a6ec01a Try to work around MSL compiler stupidity. 12 June 2020, 16:42:50 UTC
195bcbc Merge remote-tracking branch 'origin/master' into shoaibkamil/correct_memory_fences 12 June 2020, 16:19:49 UTC
887eacc Merge pull request #5029 from halide/abadams/fix_associativity Make it harder for the associativity test to get confused 12 June 2020, 05:14:28 UTC
7caecf2 Pass LLVM_VERSION to tests 11 June 2020, 22:43:35 UTC
471e882 More verbose error in CSE 11 June 2020, 22:43:35 UTC
67bbbd3 Add comment to AddAtomicMutex 11 June 2020, 22:43:35 UTC
8ca7602 Add 16-bit float to associative ops table 11 June 2020, 22:43:35 UTC
f681742 Simplify some code in deinterleave 11 June 2020, 22:43:35 UTC
d0584b3 Permit fusing pure and impure rvars 11 June 2020, 22:43:35 UTC
394536f Make lossless_cast more aggressive 11 June 2020, 22:43:24 UTC
cec8290 Fix some "MachO doesn't support COMDAT" issues in runtime Runtime code that will be instantiated for OSX/iOS needs to ensure that there are no plain 'inline' functions -- they must be either WEAK or __attribute__((always_inline)) -- otherwise, some compiler configurations can produce the error above. (Note that this also applies to member functions that are defined inline, even without an explicit 'inline' keyword). (Note also that the vagaries of C++ mean that declaring a ctor implies that a dtor will be auto-created; in some of these we must explicitly declare the dtor so that it too is always-inlined, even if it is empty...) 11 June 2020, 22:20:39 UTC
979d701 Don't lift constant integer offsets 11 June 2020, 21:44:58 UTC
d755381 Stop copying strings 11 June 2020, 20:59:01 UTC
1483793 Address review comments, make algorithm simpler. 11 June 2020, 20:48:45 UTC
4e55416 Make it harder for the associativity test to get confused 11 June 2020, 19:29:38 UTC
8cddb2e Merge pull request #5027 from halide/srj-absd Fix codegen for absd() in GLSLBase 11 June 2020, 17:14:55 UTC
3d19643 Merge pull request #5026 from halide/srj-glsl Combine visit(Cast) for GLSL and OpenGLCompute 11 June 2020, 17:14:42 UTC
665001d Partially address reviewer comments 11 June 2020, 15:59:37 UTC
3721fcb Merge pull request #5028 from halide/srj-appv Enable verbosity in apps builds 11 June 2020, 00:50:22 UTC
53e8ab6 Also add --output-on-failure 11 June 2020, 00:31:56 UTC
9e780af Enable verbosity in apps builds Hoping this will help us track down flaky Windows failures. 11 June 2020, 00:22:27 UTC
5c52122 Merge pull request #5023 from halide/abadams/more_simplifier_rules New simplifier rules necessary for the gpu autoscheduler 11 June 2020, 00:20:59 UTC
4a2216d Fix codegen for absd() in GLSLBase It was emitting as a float, which is *never* correct, since absd() is only used for int or uint types. (This happened to work before because GLSL was previously also incorrectly using float for uint in some cases.) Also did a drive-by removal of code in Codegen_C that recapitulated the logic from IROperator.cpp; maybe the type field of absd() was incorrect at some point in the past, but this calculation seems redundant and wrong now. 10 June 2020, 22:25:06 UTC
ab1c53e Combine visit(Cast) for GLSL and OpenGLCompute These are the only two overrides of `visit(Cast)` from GLSLBase and they both have identical implementations; combine them into one and move into GLSLBase to save code. 10 June 2020, 22:15:08 UTC
17f0176 clang-format 10 June 2020, 20:22:29 UTC
724cd28 clang-format 10 June 2020, 20:19:25 UTC
d7225ae Tweak spacing 10 June 2020, 20:18:07 UTC
5f0ce89 Merge remote-tracking branch 'origin/master' into shoaibkamil/correct_memory_fences 10 June 2020, 20:11:55 UTC
75fe44a Slight change in D3D12 logic. 10 June 2020, 20:09:09 UTC
9cec5a5 New simplifier rules necessary for the gpu autoscheduler 10 June 2020, 17:20:36 UTC
8b9081b Merge pull request #5021 from halide/abadams/fewer_print_parentheses Fewer print parentheses 10 June 2020, 16:39:23 UTC
27b478e Minor 10 June 2020, 16:34:40 UTC
ca420a4 Checkpoint 10 June 2020, 15:36:34 UTC
ee5f90e Merge pull request #5015 from acolinisi/PR--cmake-llvm-dynlib-2 cmake: llvm: fix linking against LLVM shared lib 10 June 2020, 06:49:36 UTC
3609c63 Merge pull request #5022 from halide/wording_fix Small wording improvements. 10 June 2020, 06:36:02 UTC
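
Several entries in this log (a31c39e, 5534e3f, 347608d) describe a pass that lifts loop-invariant if statements out of loop nests. The sketch below illustrates the general idea in plain C++ rather than in Halide's IR. The function names `scale_guard_inside` and `scale_guard_hoisted` are hypothetical and do not come from the Halide code base, and the real pass operates on compiler IR, not on hand-written loops like these.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration only; this is not Halide's IR or API.
// Before: the condition `enabled` does not depend on the loop variable,
// yet it is tested on every iteration.
std::vector<int> scale_guard_inside(const std::vector<int> &in, bool enabled) {
    std::vector<int> out(in.size(), 0);
    for (std::size_t i = 0; i < in.size(); i++) {
        if (enabled) {
            out[i] = in[i] * 2;
        }
    }
    return out;
}

// After: because the condition is loop invariant, the if statement can be
// hoisted outside the loop and tested once. The result is unchanged.
std::vector<int> scale_guard_hoisted(const std::vector<int> &in, bool enabled) {
    std::vector<int> out(in.size(), 0);
    if (enabled) {
        for (std::size_t i = 0; i < in.size(); i++) {
            out[i] = in[i] * 2;
        }
    }
    return out;
}
```

Both functions return the same result for any input; the hoisted form simply evaluates the guard once instead of once per iteration.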