https://github.com/halide/Halide

Revision Message Commit Date
ed529e0 Align the base when doing strided loads from constant addresses When we codegen something like f[ramp(x + 1, 2, 16)], where f is an internal allocation, we subtract the 1, do the dense load f[ramp(x, 1, 32)] and then take the odd lanes of the result. The reason for this is that it's likely that there's an f[ramp(x, 2, 16)] nearby, and aligning down the x+1 to x means we can share the dense loads and just deinterleave. This PR does the same when there's no x, just an odd constant. This means that cases like f[ramp(64, 2, 16)] + f[ramp(65, 2, 16)] now generate much better assembly. In one case I have it speeds up an entire pipeline by 8%, because aligning the loads in this way causes them to all be promoted off the stack into registers. 29 November 2020, 22:07:28 UTC
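The load-sharing idea in the commit message above can be sketched in plain Python. This is only an illustrative model and not Halide's code generator: `ramp`, `strided_load_direct`, and `strided_load_aligned` are hypothetical helper names, a Python list stands in for the internal allocation `f`, and the stride is assumed to be 2.

```python
def ramp(base, stride, lanes):
    """Indices base, base + stride, ... (lanes entries), like ramp(base, stride, lanes)."""
    return [base + stride * i for i in range(lanes)]


def strided_load_direct(f, base, lanes):
    """Reference version: gather f[ramp(base, 2, lanes)] element by element."""
    return [f[i] for i in ramp(base, 2, lanes)]


def strided_load_aligned(f, base, lanes):
    """Align the base down to an even index, do one dense load, then deinterleave."""
    offset = base % 2                 # 0 for an even base, 1 for an odd base
    aligned = base - offset           # e.g. 65 becomes 64
    dense = [f[i] for i in ramp(aligned, 1, 2 * lanes)]  # one dense load of 2 * lanes
    return dense[offset::2]           # even lanes or odd lanes of the dense load


f = [i * i for i in range(128)]       # stand-in for the internal allocation

# f[ramp(64, 2, 16)] and f[ramp(65, 2, 16)] both come from the dense load f[ramp(64, 1, 32)].
assert strided_load_aligned(f, 64, 16) == strided_load_direct(f, 64, 16)
assert strided_load_aligned(f, 65, 16) == strided_load_direct(f, 65, 16)
```

Because both strided loads reduce to the same dense load at base 64, a compiler that applies this rewrite can issue that dense load once and take the even and odd lanes separately.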
bfbfacd Revert formatting of Hexagon intrinsic table (#5484) * Revert formatting of Hexagon intrinsic table * Revert one extra find and replace. 27 November 2020, 20:31:02 UTC
f911a89 Add as_intrinsic helper (#5480) * Add as_intrinsic helper. * Rename calls of known intrinsics. * Fix check_sio. 26 November 2020, 07:40:25 UTC
59bbc4d Simplify intrinsics of broadcasts to broadcasts of intrinsics (#5473) * Simplify intrinsics of broadcasts to broadcasts of intrinsics. * Fix broadcast elementwise simplifications for nested broadcasts. * broadcasted -> broadcast. 25 November 2020, 19:36:02 UTC
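The identity behind this simplification can be checked with a small Python model. This is a sketch under stated assumptions, not the simplifier's actual rule: `broadcast`, `apply_lanewise`, and `absd` are hypothetical names, and a list stands in for a vector.

```python
def broadcast(value, lanes):
    """A vector whose lanes all hold the same scalar value."""
    return [value] * lanes


def apply_lanewise(op, *vectors):
    """Apply a scalar operation independently to each lane of the input vectors."""
    return [op(*lane) for lane in zip(*vectors)]


def absd(a, b):
    """Example lanewise operation (absolute difference), used only for illustration."""
    return abs(a - b)


# Applying the operation to broadcasts gives the same vector as
# broadcasting the result of the scalar operation.
lanes = 8
assert apply_lanewise(absd, broadcast(3, lanes), broadcast(10, lanes)) == broadcast(absd(3, 10), lanes)
```

The right-hand form evaluates the operation once on scalars instead of once per lane, which is why rewriting in that direction is attractive.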
3cb2adb Improvements to HalideTraceViz (#5466) - Handle 4D inputs more gracefully - Improve horizontal squishing of long labels 24 November 2020, 21:52:00 UTC
87c9fac Fail CMake when LLVM_LINK_LLVM_DYLIB conflicts with wasm (#5472) * Fail CMake when LLVM_LINK_LLVM_DYLIB conflicts with wasm * Update error message and add comment. 24 November 2020, 00:32:49 UTC
31e9687 Remove AndConditionOverDomain and fix Interval::everything() uses in Bounds (#5455) * rm AndConditionOverDomain and fix Interval::everything() uses in Bounds * fix clang-tidy complaint * rm unnecessary/irrelevant comment * nit: add line break 23 November 2020, 23:01:04 UTC
7447e51 Better codegen for ramps with non-const stride (#5463) 23 November 2020, 18:11:14 UTC
7130069 Fix inconsistency between code & documentation. (#5469) 22 November 2020, 20:54:41 UTC
08825b6 Add optional NAMESPACE arg to `add_halide_library()` (#5467) * Add optional NAMESPACE arg to `add_halide_library()` This is just syntactic sugar for adding the namespace explicitly to the function name, but for code with long namespaces and/or function names this can make for more readable build files. (The Bazel/Blaze build rules offer a similar option and it works well there.) * Update README_cmake.md 22 November 2020, 00:21:25 UTC
6c754cf Change "CMAKE_MODULE_PATH" to "CMAKE_PREFIX_PATH" (#5461) I tried to use instructions for a basic CMake project with a locally downloaded copy of Halide, and got the following error:
```
CMake Error at CMakeLists.txt:9 (find_package):
  By not providing "FindHalide.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "Halide", but
  CMake did not find one.

  Could not find a package configuration file provided by "Halide" with any
  of the following names:

    HalideConfig.cmake
    halide-config.cmake

  Add the installation prefix of "Halide" to CMAKE_PREFIX_PATH or set
  "Halide_DIR" to a directory containing one of the above files. If "Halide"
  provides a separate development package or SDK, be sure it has been
  installed.
```
Changing `CMAKE_MODULE_PATH` to `CMAKE_PREFIX_PATH` worked for me. 20 November 2020, 19:53:34 UTC
c510744 Allow HL_EXTRA_OUTPUTS as a way to get extra Generator outputs for debugging (#5457) * Allow HL_EXTRA_OUTPUTS as a way to get extra Generator outputs for debugging * Update Generator.cpp 20 November 2020, 02:09:08 UTC
5137369 Update Android NDK support in apps/ (#5454) * Update Android NDK support in apps/ The Makefile help for apps/ assumed a fairly old version of the Android NDK (~2017). This updates to assume r19 or later: - No need to do `make-standalone-toolchain` anymore - Clang instead of GCC - assume static linking of libc++ is the right default I am limited in my ability to test on-device here (I don't have a device that will let me test HVX easily). apps/blur works fine on my P2XL after this change, though. Also, drive-by fix to apps/simd_op_check to remove hvx_64 references. 17 November 2020, 18:38:06 UTC
c0ca4ff Revert C++11 usage in hexagon_remote (#5446) * Revert C++11 usage in hexagon_remote Building hexagon_remote with C++11 ends up inserting some unwanted C++11-related symbols (__gxx_personality_v0) that aren't present. This reverts those changes, and modifies HalideRuntime.h so that most of the `__cplusplus` checks are now `__cplusplus >= 201103L` (i.e., C++ convenience features only exist when compiled under C++11 or later). * Appease MSVC idiocy 16 November 2020, 20:34:33 UTC
e3fc746 Fix typo in comment: jit_compile -> compile_jit (#5453) 16 November 2020, 17:34:31 UTC
e5ff1b6 Revert "Push lets near their uses (whenever possible) while CSEing += (#5387)" (#5448) This reverts commit d5e425ee29f165b9de3527a267837093f99e59be. 16 November 2020, 17:33:32 UTC
fca1b44 update readme.md for hexagon (#5444) 12 November 2020, 23:24:18 UTC
c8ab278 add conditions for one-sided bounds to LT/GT, LE/GE, EQ, NEQ (#5438) 11 November 2020, 18:41:34 UTC
eac39c7 is_zero -> is_const_zero (#5436) 11 November 2020, 17:25:31 UTC
d53b9ef Make simd_op_check.h more overridable (#5442) 10 November 2020, 18:11:14 UTC
f524111 Fix for trunk LLVM (#5435) 05 November 2020, 19:55:10 UTC
8b333bc is_zero is a compile-time check if it's the constant zero (#5433) I think for this code we want a runtime check if it's zero or not. 05 November 2020, 17:10:44 UTC
c45124f Use new flag for clang tidy behavior in cmake (#5428) * Use new flag for clang tidy 04 November 2020, 17:35:32 UTC
ece39b6 Upgrade PyBind11 version in CMake to 2.5 (#5427) v2.4.3 can generate a lot of compiler warnings under C++17; v2.5 fixes these. Note 1: I am unsure about the issues with keeping in sync with Ubuntu 20.04; tagging @alexreinking for comments Note 2: the current version of PyBind11 is actually v2.6, but it has many more changes and upgrading looks nontrivial; deliberately holding off on that upgrade for now. 04 November 2020, 00:12:15 UTC
cd3f1d8 Fix various clang-tidy issues (#5426) * Fix various clang-tidy issues For some reason, these are only getting flagged under C++17 builds, but they are legit (minor) issues we want to fix. 03 November 2020, 01:31:02 UTC
69643ce Fix for trunk LLVM (#5425) 02 November 2020, 23:40:01 UTC
025f054 Remove superfluous boundary condition in resize (#5414) * Remove superfluous boundary condition in resize and tweak schedule. ~10-15% faster 30 October 2020, 17:12:05 UTC
6994a15 fix GCC -> GNU in generator expressions (#5419) 30 October 2020, 16:20:38 UTC
acb818b Add missing operator^ overloads in CppVector (Issue #5415) (#5416) 30 October 2020, 16:18:39 UTC
91b88b5 Don't declare round/roundf for multi-threaded MSVC builds (Issue #5403) (#5417) 30 October 2020, 16:18:09 UTC
feb81a2 doxygen wasn't finding the runtime (#5410) 28 October 2020, 21:08:39 UTC
3fd654f loosen preconditions on div by single point in Bounds.cpp (#5407) Loosen preconditions on div by single point for integers 28 October 2020, 21:07:57 UTC
d5e425e Push lets near their uses (whenever possible) while CSEing += (#5387) * Push lets near their uses (whenever possible) while CSEing += The CSE pass tries to do CSE jointly on the index and value of store nodes. This is to stop: f[x] = f[x] + y from turning into f[x] = f[z] + y This is because the two 'x' indices are not CSE'd together. However, one problem with jointly CSEing the store index and values is that after CSEing, the pass puts the lets (the values that were CSE'd) before the new store. That is we get, let(t0...) let(t1...) f[t0] = f[t0] + function_of(t1) Suppose however, there was nothing to CSE between the store index and value, that is the index was unchanged after CSE. In that case, moving lets before the store puts the lets too far away from their uses. This is ok except, there are passes like LoopCarry that are beneficial when they are able to see a long continuous block of stores (eg. when unrolling). But now, they'll see a long sequence of LetStmts. Instead, if the index was unchanged after CSE, we should build up the stores as f[x] = let(t0.. in (value)); This way LoopCarry is given a chance to see a series of stores. Handling this in CSE, means the LoopCarry pass need not be complicated. Change-Id: Iae19e1f69a6b38f3224a64b0c4781533e3862970 * In CSE, when we bundle together the store index and store value, create LetStmts only for CSE'd values that are needed by the store index. The rest can be Let expressions around the store value. * Formatting fix for printing the IR after CSE * Be smarter about pushing lets near their uses when CSEing += In a previous patch, in the code that handles CSEing += store operations, we weren't general enough; we only handled the specific case when the store index remained unchanged when CSE'd together with the store value. This patch is more general.
Even if the store index changes, only the use-def chains created from the values used by the new store index that end in defs created by CSE are retained as Let stmts around the new store. Others go as Let expressions around the store value. * Fix some formatting issues exposed by clang-format and remove the inclusion of ExprUsesVar.h in CSE.cpp because it is not used anymore. * Use override in GetVarsUsed in CSE.cpp 28 October 2020, 20:19:40 UTC
7896513 Update IntrusivePtr.h 28 October 2020, 00:38:57 UTC
6e0b499 Update IntrusivePtr.h 28 October 2020, 00:38:57 UTC
e96a0e9 Update IntrusivePtr.h 28 October 2020, 00:38:57 UTC
116a3f0 More clang-tidy cleanup 28 October 2020, 00:38:57 UTC
864955e web assembly threads + demo app (#5395) Add webassembly demo app and enable webassembly in the makefile. 26 October 2020, 23:23:46 UTC
deb10c6 Add -fno-threadsafe-statics 26 October 2020, 17:12:50 UTC
f853c89 Upgrade hexagon_remote/Makefile to use C++11 HalideRuntime.h now requires at least C++11 (for C++ files), so ensure that we pass `-std=c++11` for all those when building the remote. 26 October 2020, 17:12:50 UTC
9ea017c Fix transitive dependencies 26 October 2020, 16:47:38 UTC
7142fa7 Add Stage.gpu_lanes to Python bindings. 23 October 2020, 17:19:50 UTC
fef108b Upgrade WABT version to 1.0.19 23 October 2020, 16:56:44 UTC
2d1aee9 Add a test case for multiple argument memoize_tag to demonstrate usage. (#5393) * Add a test case for multiple argument memoize_tag usage to demonstrate how it works. * Fix formatting typo. 22 October 2020, 20:39:39 UTC
475c7a0 Use locally declared type. 22 October 2020, 16:35:54 UTC
4f072bc Fix bounds resulting in vector types. 22 October 2020, 16:35:54 UTC
75077ed Add missing quotes in run-clang-format.sh 21 October 2020, 23:16:52 UTC
0cc8c30 Tickle Buildbots 21 October 2020, 22:34:15 UTC
8d1784f Change NULL -> nullptr: enable the modernize-use-nullptr check in clang-tidy and fix all complaints 21 October 2020, 22:34:15 UTC
31f1937 Merge pull request #5365 from halide/pdb_remove_hvx_v64 Issue #3925 : Remove hvx_64 21 October 2020, 20:36:54 UTC
d94e7a7 Update CodeGen_Hexagon.cpp 21 October 2020, 16:44:25 UTC
e520503 Merge branch 'master' into pdb_remove_hvx_v64 21 October 2020, 16:33:44 UTC
fc959e7 Merge pull request #5382 from halide/srj/readability Enable the useful readability-* checks in clang-tidy 21 October 2020, 16:26:49 UTC
ce2f41d Merge pull request #5384 from dragly/dragly/python-negate-operator Add `logical_not` function for Python 21 October 2020, 16:25:54 UTC
235abe4 Tickle Buildbots 21 October 2020, 16:24:22 UTC
acbc69a Enable modernize-use-equals-default/delete in clang-tidy 21 October 2020, 16:24:22 UTC
61792d8 Add logical_not function for Python This change introduces `logical_not` as a free function and member function that calls `operator!`. The reason why a new function is added is because there is no `operator!` in Python and the `not` keyword cannot be overloaded. Hence, there was currently no way to call the C++ `operator!` in Python. 21 October 2020, 08:35:48 UTC
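The language constraint described in this commit can be demonstrated with a minimal Python sketch. The `Expr` class below is a hypothetical stand-in for a symbolic expression node, not Halide's real Python class, and `try_not_keyword` is a helper invented for the demonstration. The point is that `not` always goes through `__bool__`, which must return a plain `bool`, so it cannot build a new expression; an explicit `logical_not` function can.

```python
class Expr:
    """Hypothetical symbolic expression node; not Halide's real Python class."""

    def __init__(self, text):
        self.text = text

    def __bool__(self):
        # Attempt to make `not expr` build a new expression.
        # Python rejects this: __bool__ must return a bool.
        return Expr(f"!({self.text})")

    def logical_not(self):
        # Member-function form: builds the negated expression explicitly.
        return Expr(f"!({self.text})")


def logical_not(expr):
    """Free-function form that forwards to the member function."""
    return expr.logical_not()


def try_not_keyword(expr):
    """Return the error message raised by `not expr`, or None if it succeeds."""
    try:
        not expr
    except TypeError as err:
        return str(err)
    return None


cond = Expr("x > 3")
assert try_not_keyword(cond) is not None          # `not cond` raises TypeError
assert logical_not(cond).text == "!(x > 3)"       # the explicit function works
```

Offering both a free function and a member function, as the commit does, lets callers write either `logical_not(e)` or `e.logical_not()` depending on which reads better in context.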
e2820e2 Enable the useful readability-* checks in clang-tidy 20 October 2020, 21:11:46 UTC
b2c9769 Merge branch 'master' into pdb_remove_hvx_v64 20 October 2020, 20:57:09 UTC
00f50a1 Merge pull request #5379 from halide/srj/mod2 Enable clang-tidy's modernize-use-default-member-init check 20 October 2020, 20:18:09 UTC
c2ed326 Enable clang-tidy's modernize-use-default-member-init check 20 October 2020, 20:17:53 UTC
c2c35b3 remove hvx_64 from Halide/Makefile 20 October 2020, 20:08:13 UTC
b5db7fd Merge pull request #5381 from halide/srj/perfchecks Enable interesting performance-* clang-tidy checks 20 October 2020, 18:53:44 UTC
83d52ab Enable interesting performance-* clang-tidy checks 20 October 2020, 18:44:03 UTC
8221d6c Merge pull request #5378 from halide/srj/misc Enable the interesting misc-* clang-tidy checks 20 October 2020, 18:39:01 UTC
8f3ecb4 Enable the interesting misc-* clang-tidy checks 20 October 2020, 18:38:45 UTC
1e8505e Merge pull request #5377 from halide/srj/modernize Enable clang-tidy's modernize-deprecated-headers check and apply fixes. 20 October 2020, 18:25:01 UTC
a3ef417 Enable clang-tidy's modernize-deprecated-headers check and apply fixes. 20 October 2020, 18:18:15 UTC
5e91d6f clang-format 20 October 2020, 16:26:26 UTC
7fdd42c clang-format 20 October 2020, 16:26:14 UTC
00ae979 Merge branch 'master' into pdb_remove_hvx_v64 20 October 2020, 16:16:20 UTC
a2934d4 Update d3d12compute.cpp 20 October 2020, 16:03:17 UTC
f7e77e2 Extend clang-tidy checks to src/runtime (and fix resulting errors) 20 October 2020, 16:03:17 UTC
a9e3941 Merge pull request #5372 from halide/simplify-vectorreduce Add simplification rules for vectorreduce of broadcasts 20 October 2020, 06:45:49 UTC
0ca44db Merge pull request #5358 from halide/srj/tidy-all Extend clang-tidy checks into tools, utils, and python_bindings 19 October 2020, 22:12:09 UTC
85f143c Address review comments 19 October 2020, 21:27:54 UTC
14dd26a Extend clang-tidy checks into tools, utils, and python_bindings 19 October 2020, 20:58:12 UTC
af57921 Drop support for LLVM9 (#5121) 19 October 2020, 20:21:55 UTC
7a68888 Makefile tweaks to work on ubuntu 19 October 2020, 19:58:25 UTC
99f01a8 Update Makefile 19 October 2020, 19:58:25 UTC
bc615b0 Update run-clang-format.sh 19 October 2020, 19:58:25 UTC
a9975c8 Update Makefile 19 October 2020, 19:58:25 UTC
dc5e171 Update Makefile 19 October 2020, 19:58:25 UTC
f3c47ae Fix LLVM_DIR value 19 October 2020, 19:58:25 UTC
6a8e292 Move clang-tidy logic into script 19 October 2020, 19:58:25 UTC
1a17dbe Move the clang-format logic into a shell script This puts the truth for our clang-format logic into a shell script rather than the Makefile, in hopes of making it slightly easier for CMake users to use. 19 October 2020, 19:58:25 UTC
d8dac07 Merge pull request #5370 from halide/likely-if Make loop partitioning a bit more robust for if statements 19 October 2020, 19:56:01 UTC
d049a83 Merge branch 'master' of https://github.com/halide/Halide into simplify-vectorreduce 19 October 2020, 18:40:50 UTC
8e3262b OpenCL Texture Support (#5297) Add OpenCL Texture Support (https://github.com/halide/Halide/pull/5297) 19 October 2020, 17:04:13 UTC
e93f81a Use has_uncaptured_likely_tag instead. 19 October 2020, 06:10:21 UTC
e164866 Add simplifications for vectorreduce of broadcasts. 19 October 2020, 06:03:49 UTC
a1d0201 Fix likely for if when the likely is not the outermost expression. 17 October 2020, 05:38:12 UTC
2cde234 prefer using HVX over HVX_128 16 October 2020, 19:43:57 UTC
5286a68 Fix wasm-related glitches in our timing/benchmarking code 16 October 2020, 17:40:04 UTC
16604ae Set vector_size to 128. rule out vector sizes that made sense on HVX_64 now that HVX_128 is the only mode for HVX 16 October 2020, 00:07:35 UTC
8151b77 Check only for Target::HVX 16 October 2020, 00:04:47 UTC
04cd8dd Remove hvx_64 and hvx to python bindings 15 October 2020, 23:17:17 UTC
4d1a4bb Fix bad merge of test/correctness/mul_div_mod.cpp 15 October 2020, 21:42:15 UTC
13a4eb1 Merge branch 'master' into pdb_remove_hvx_v64 15 October 2020, 21:12:26 UTC
09f9eda [camera_pipe] - In hvx_128 we need 4 threads to saturate hvx with work 15 October 2020, 21:08:34 UTC