swh:1:snp:2c68c8bd649bf1bd2cf3bf7bd4f98d247b82b5dc

Revision Message Commit Date
6f5dc6b Ensure that QuantizationInfo.dimension is initialized Currently, it can contain garbage after parsing 07 December 2020, 19:27:17 UTC
4f554a7 Change #include path style to disambiguate headers (#5529) 07 December 2020, 19:21:35 UTC
8e7c992 Merge branch 'master' into interpret_nn 07 December 2020, 18:53:46 UTC
7f70907 Combine align and slice for the small vectors in align_loads (#5497) * Combine align and slice for the small vectors in align_loads * Fix format 07 December 2020, 17:17:14 UTC
1800dc2 Simplify a slice of slice (#5495) * Simplify a slice of slice * Fix format * Simplify for slice of concats + tests * format * format * New line to improve readability Co-authored-by: Steven Johnson <srj@google.com> 07 December 2020, 01:45:39 UTC
bd53b47 Allow creation of IntImm/UIntImm with any number of bits up to 64 (#5441) * Allow creation of IntImm/UIntImm with any number of bits up to 64 * Changes: - check that the number of bits is >= 1 - modify upgrade_* functions - allow printing of type with arbitrary number of bits. * Fix format * next_power_of_two which will end Co-authored-by: Steven Johnson <srj@google.com> 06 December 2020, 20:39:26 UTC
7ea09cd Point fft JIT tests to Halide binary (#5521) 06 December 2020, 07:38:56 UTC
d325e13 Add simd_op_check tests and a few more patterns (#5519) * Add simd_op_check coverage of some ARM ops we generate. * Remove local filter option. * Fix expected patterns for arm32. 04 December 2020, 16:38:05 UTC
c1885fc Fixes to bounds inference on shift_left (#5477) * Add shift_left fix for signed integers by possibly negative values + regression test * add required condition on shift_left integer fix * add type check to shift_left minimum condition * fix constant folding of shifts with |b| >= type.bits() for types that allow overflow (failes correctness/simplify test) * make regression tests use scoped bindings * change condition in case int24/int48 proposal happens soon * revert changes based on overflow expectations * add more regression tests * clarify comment * add shift_left min handler for b only UB * fix clang-tidy complaint * relax shift_left of non-negative value constraint * pull case outside of unnecessary preconditions * fix clang-format complaint * fix broken precondition * add typecheck to possibly save a can_prove() call * add easy-out type check to precondition * Add descriptive comment to bug fix + add another early-exit precondition Co-authored-by: Steven Johnson <srj@google.com> 04 December 2020, 00:14:07 UTC
28f9aef Enable commented clang-format option. (#5520) 03 December 2020, 22:05:21 UTC
927edeb Merge branch 'master' into interpret_nn 03 December 2020, 18:31:24 UTC
759b241 Add version-checking to the clang-tidy and clang-format scripts (#5513) Using the 'wrong' version of the tools will produce results out of sync with our presubmit tests, so add checking to ensure the user has their env set up correctly. 03 December 2020, 18:04:00 UTC
2ddd0b0 Revert "Make context handling in GPU runtimes more consistent and robust. (#5474)" (#5515) This reverts commit f47c5c99deac86c6d1f16cfcb1743a0e9e79317d. 03 December 2020, 02:10:58 UTC
2c8e3ea Revert "Fix broken destroy_context() in gpu_multi_context_threaded_aottest.cpp (#5512)" (#5514) This reverts commit 445ed5ee5ba5e23efaabe0b8d6971c0678b5a569. 03 December 2020, 02:08:31 UTC
445ed5e Fix broken destroy_context() in gpu_multi_context_threaded_aottest.cpp (#5512) 03 December 2020, 00:35:48 UTC
a34d00d Adding CMake build for FFT (#5508) * Add fft build * Fix properties * Fix generator argument * Add "Success!" message to fft aot test. * Formatting. * Fix target directory for bench_fft 02 December 2020, 22:44:43 UTC
f47c5c9 Make context handling in GPU runtimes more consistent and robust. (#5474) This PR adds a consistent GPU compiled kernel cache across the Cuda, Direct3D, OpenCL, and Metal runtimes. This cache is robust for kernels being used across multiple contexts and threads as well as using common code via a template. OpenGL and OpenGLCompute are not addressed due to issues in their implementation. There should be no regressions for those runtimes however. Adds tests for many GPU kernels and kernels across contexts and threads. Fixes a bug in CUDA runtime where some error message text in cuda_do_multidimensional_copy was not initialized. Fixes a bug in CUDA runtime where device release code did not run if CUDA libraries are directly linked into the executable. (This would have caused crashes due to the device allocation caching among other issues.) 02 December 2020, 22:40:21 UTC
073b8e4 Add CMake presets for 3.19+ users (#5506) * add CMakePresets.json and update docs * fix Windows presets * remove NDEBUG from GCC options * fix typo in README 02 December 2020, 22:19:34 UTC
1c0f824 Restructure apps to be fully external. (#5507) * Restructure apps to be fully external. * drive-by fix default Halide_TARGET * patch up fused apps build * remove doubled line * fixing multiple import for 3.16 * fix naming convention * Add missing #include <cstdio> 02 December 2020, 22:15:23 UTC
329a405 Enable constant folding of broadcasted constants (#5500) * Enable constant folding of broadcasted constants. * Make some scalar constant folding tests vectors. * Remove excessive simplify calls causing infinite recursion. Co-authored-by: Steven Johnson <srj@google.com> 02 December 2020, 18:29:08 UTC
ce684c6 Merge branch 'master' into interpret_nn 01 December 2020, 21:55:19 UTC
6cc24bb Fix compile time regression in fft (#5494) * Use equal instead of can_prove equality when examining enclosing scope There can be a lot of things in there, and can_prove is expensive. * Speed up bounds_of_inner_var By only expanding enclosing let stmts if the variable is actually used in the result, and by finding the last usage and then skipping anything earlier (skipping over nested producer nodes) Co-authored-by: Steven Johnson <srj@google.com> 01 December 2020, 20:49:09 UTC
6af4361 Fixes for trunk LLVM (#5499) 01 December 2020, 16:58:13 UTC
44c9a72 Reduce size of test image (#5496) 01 December 2020, 04:32:46 UTC
1ad6fb8 Fix case where simplifying interleaves might need a slice of the original vector (#5492) * Replace is_negative_negatable_const and associated cruft with lossless_negate. * Don't assume an interleave consumes all of the vectors it is shuffled from. * Add test of slices of interleaves. * Fix formatting * Rephrase logic. 01 December 2020, 04:31:39 UTC
491791d Simplify signed shifts more strongly (#5491) * Simplify signed shifts more strongly. * Simplify after negating b. * Also mutate other possibly simplifying cast. 01 December 2020, 04:31:00 UTC
edfc98b Restructure interpret_nn (#5498) * Restructure interpret_nn - Shuffle stuff into subdirs, remove some dead files, do some Makefile cleanup * Move tflite_parser -> tflite/ 01 December 2020, 00:52:36 UTC
7df01a5 Track TFLite 2.4, not master To simplify ongoing upkeep, let's have apps/interpret_nn track the TF 2.4 release (which is at 2.4.0-rc3 right now), rather than master. (This means removing the support for uint64 types in our TFLite-adjacent code, which was added to master post-2.4) 30 November 2020, 23:08:46 UTC
960f857 Fix All value from the ValType table (#5493) 30 November 2020, 22:58:10 UTC
682b771 Merge branch 'master' into interpret_nn 30 November 2020, 22:50:58 UTC
21afdc4 Align the base when doing strided loads from constant addresses (#5489) When we codegen something like f[ramp(x + 1, 2, 16)], where f is an internal allocation, we subtract the 1, do the dense load f[ramp(x, 1, 32)] and then take the odd lanes of the result. The reason for this is that it's likely that there's an f[ramp(x, 2, 16)] nearby, and aligning down the x+1 to x means we can share the dense loads and just deinterleave. This PR does the same when there's no x, just an odd constant. This means that cases like f[ramp(64, 2, 16)] + f[ramp(65, 2, 16)] now generate much better assembly. In one case I have it speeds up an entire pipeline by 8%, because aligning the loads in this way causes them to all be promoted off the stack into registers. 30 November 2020, 21:14:56 UTC
226b12c Improve speed of testing apps/ (#5482) * Improve speed of testing apps/ - Skip all app tests that are labeled as 'benchmarks' - Specify `--build-noclean` to avoid unnecessary full rebuilds * Change label 'benchmark' -> 'slow_tests' 30 November 2020, 19:12:07 UTC
16929df Add Type::widen and Type::narrow helpers. (#5478) * Add Type::widen and Type::narrow helpers. * widen -> wide, more uses of wide. * wide back to widen. Co-authored-by: Dillon Sharlet <dsharlet@gmail.com> 30 November 2020, 18:27:56 UTC
78489d0 Small cleanups/fixes (#5479) * Small cleanups/fixes peeled from lower-patterns2. * Fix derp * Fix possibly undefined evaluation order. * Smaller code. * Work around test issue. 30 November 2020, 16:15:16 UTC
49ca720 Replace is_negative_negatable_const and more logic with lossless_negate (#5490) * Replace is_negative_negatable_const and associated cruft with lossless_negate. * Add comment 30 November 2020, 15:43:18 UTC
bfbfacd Revert formatting of Hexagon intrinsic table (#5484) * Revert formatting of Hexagon intrinsic table * Revert one extra find and replace. 27 November 2020, 20:31:02 UTC
f911a89 Add as_intrinsic helper (#5480) * Add as_intrinsic helper. * Rename calls of known intrinsics. * Fix check_sio. 26 November 2020, 07:40:25 UTC
c9d7806 Add quantize_test 25 November 2020, 20:18:22 UTC
59bbc4d Simplify intrinsics of broadcasts to broadcasts of intrinsics (#5473) * Simplify intrinsics of broadcasts to broadcasts of intrinsics. * Fix broadcast elementwise simplifications for nested broadcasts. * broadcasted -> broadcast. 25 November 2020, 19:36:02 UTC
2ee4828 Add reshape_test 25 November 2020, 02:01:34 UTC
771e1ea Update buffer_util.h 25 November 2020, 01:13:08 UTC
27fa4b4 Fix bonehead mistake 25 November 2020, 01:05:34 UTC
92bcf19 Add pad_test 25 November 2020, 00:55:37 UTC
726ab95 Fix dopey code 25 November 2020, 00:37:24 UTC
b311b84 Add concatenation_test Also, drive-by fix to 'axis' parsing 25 November 2020, 00:19:18 UTC
073542e Reverse order of tensor axes in our tests 25 November 2020, 00:05:57 UTC
eebcd69 Add max_pool_test 24 November 2020, 23:18:33 UTC
84244eb Add stub test for FullyConnectedOp 24 November 2020, 23:09:34 UTC
2758853 Make more types amenable to use with CHECK 24 November 2020, 22:49:12 UTC
4c87186 Update conv2d_test.cpp 24 November 2020, 22:19:15 UTC
f5f1b20 Revert changes in Makefile.inc 24 November 2020, 22:04:30 UTC
9fbbfa4 Revert "Revert changes in Makefile.inc" This reverts commit cbff3c369ff2f491c52e994009c3e33060bc1ae1. 24 November 2020, 22:03:23 UTC
cbff3c3 Revert changes in Makefile.inc 24 November 2020, 21:54:15 UTC
a6057f4 Merge branch 'master' into interpret_nn 24 November 2020, 21:53:29 UTC
3cb2adb Improvements to HalideTraceViz (#5466) - Handle 4D inputs more gracefully - Improve horizontal squishing of long labels 24 November 2020, 21:52:00 UTC
5a7ad6e Update CompareBuffers + tests. Add smarts to allow for a small percentage of off-by-one results without considering a mismatch. Simplify the Conv and DepthwiseConv reference tests to use float. 24 November 2020, 20:10:27 UTC
f9e9e64 Merge branch 'master' into interpret_nn 24 November 2020, 20:07:16 UTC
694a409 run-clang-format.sh 24 November 2020, 01:31:48 UTC
ca60f39 Add Conv2D & DepthwiseConv2D tests This involved some expanding of the op_test_helper code so the existing tests got tweaked some too. Deleted the no-longer-needed convolution_test.cc. 24 November 2020, 01:29:14 UTC
87c9fac Fail CMake when LLVM_LINK_LLVM_DYLIB conflicts with wasm (#5472) * Fail CMake when LLVM_LINK_LLVM_DYLIB conflicts with wasm * Update error message and add comment. 24 November 2020, 00:32:49 UTC
31e9687 Remove AndConditionOverDomain and fix Interval::everything() uses in Bounds (#5455) * rm AndConditionOverDomain and fix Interval::everything() uses in Bounds * fix clang-tidy complaint * rm unnecessary/irrelevant comment * nit: add line break 23 November 2020, 23:01:04 UTC
c7855d1 Merge branch 'master' into interpret_nn 23 November 2020, 18:45:53 UTC
7447e51 Better codegen for ramps with non-const stride (#5463) 23 November 2020, 18:11:14 UTC
7130069 Fix inconsistency between code & documentation. (#5469) 22 November 2020, 20:54:41 UTC
3103bb6 Add more constraints. 22 November 2020, 19:57:53 UTC
ccafa0c Fix Makefile to respect HL_TARGET 22 November 2020, 19:15:55 UTC
e559ff6 Pad with leading zeros to alphabetize layers. 22 November 2020, 19:11:22 UTC
e7906fd Small optimizations for ARM. 22 November 2020, 05:42:12 UTC
e814a3a Narrow to 16-bits before adding offset. 22 November 2020, 05:27:48 UTC
84874ff Small optimizations. 22 November 2020, 03:20:12 UTC
28f4c0d Add halide_profiler_report call to benchmark. 22 November 2020, 01:42:23 UTC
08825b6 Add optional NAMESPACE arg to `add_halide_library()` (#5467) * Add optional NAMESPACE arg to `add_halide_library()` This is just syntactic sugar for adding the namespace explicitly to the function name, but for code with long namespaces and/or function names this can make for more readable build files. (The Bazel/Blaze build rules offer a similar option and it works well there.) * Update README_cmake.md 22 November 2020, 00:21:25 UTC
8797e8a Tweak depthwise conv schedule. 21 November 2020, 23:07:36 UTC
70bacc3 Add fully connected op 21 November 2020, 22:40:52 UTC
32be40f Add fully connected generator. 21 November 2020, 22:19:43 UTC
b27a253 Enable add of other than 4 dimensions. 21 November 2020, 08:24:20 UTC
e5f979a Hack Int8 buffers to UInt8. 21 November 2020, 08:23:58 UTC
652d729 Optimize conv mainly on ARM 21 November 2020, 06:07:13 UTC
20986e1 Style 21 November 2020, 06:06:22 UTC
b919233 Add failures when types aren't supported. 21 November 2020, 02:09:22 UTC
2f53446 Merge branch 'interpret_nn' of https://github.com/halide/Halide into interpret_nn 21 November 2020, 01:25:48 UTC
ea85d04 Update convolution_test.cpp 21 November 2020, 01:15:45 UTC
e0bb2a5 Update benchmark.cpp 21 November 2020, 01:15:14 UTC
beb0cb4 Update ops.cpp 21 November 2020, 01:14:27 UTC
665acd4 Merge branch 'interpret_nn' of https://github.com/halide/Halide into interpret_nn 21 November 2020, 01:13:32 UTC
9410577 app_util.h -> error_util.h The only remaining stuff was the CHECK/etc support, so renamed it to avoid it being a weird grab-bag-of-stuff for now. Moved into `internal_nn` namespace, like everything else. Also, APP_CHECK -> CHECK and APP_FATAL -> LOG_FATAL. 21 November 2020, 01:13:24 UTC
978a713 Merge branch 'interpret_nn' of https://github.com/halide/Halide into interpret_nn_memoize 21 November 2020, 01:11:34 UTC
a6981f8 Merge branch 'interpret_nn' of https://github.com/halide/Halide into interpret_nn 21 November 2020, 01:11:10 UTC
25abbf1 Use memoize to optimize filter_tiled. 21 November 2020, 01:08:23 UTC
fb4b5f5 Move app_util::make_unique to its only callsite 21 November 2020, 01:06:10 UTC
b91ac77 Move read/write_entire_file to file_util.h Also, driveby removal of .gitignore 21 November 2020, 01:04:11 UTC
06e5318 Add Quantize op. 21 November 2020, 01:02:46 UTC
e4126f5 Merge branch 'master' into interpret_nn 21 November 2020, 00:41:14 UTC
aa9a8c2 Add apps/interpret_nn to toplevel Makefile 21 November 2020, 00:33:03 UTC
e8b53f8 Revert inadvertent changes in apps/bilalteral_grid 20 November 2020, 23:58:52 UTC
4ea5b3a Revert inadvertent changes in apps/bilalteral_grid 20 November 2020, 23:55:18 UTC
d82ca1a Remove bogus CMake files 20 November 2020, 23:54:35 UTC
6c754cf Change "CMAKE_MODULE_PATH" to "CMAKE_PREFIX_PATH" (#5461) I tried to use instructions for a basic CMake project with a locally downloaded copy of Halide, and got the following error: ``` CMake Error at CMakeLists.txt:9 (find_package): By not providing "FindHalide.cmake" in CMAKE_MODULE_PATH this project has asked CMake to find a package configuration file provided by "Halide", but CMake did not find one. Could not find a package configuration file provided by "Halide" with any of the following names: HalideConfig.cmake halide-config.cmake Add the installation prefix of "Halide" to CMAKE_PREFIX_PATH or set "Halide_DIR" to a directory containing one of the above files. If "Halide" provides a separate development package or SDK, be sure it has been installed. ``` Changing `CMAKE_MODULE_PATH` to `CMAKE_PREFIX_PATH` worked for me. 20 November 2020, 19:53:34 UTC
85f775e Rearrange code to put more commonly read code together. 20 November 2020, 19:41:54 UTC
ff9353d Fix scheduling algorithm and add parallelism parameter. 20 November 2020, 19:03:18 UTC
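
The message for 21afdc4 above describes turning a stride-2 load with an odd base into an aligned dense load followed by a deinterleave. A minimal sketch of that index arithmetic follows, in plain Python rather than Halide. The helpers `ramp`, `load`, and `strided_load_via_dense` are hypothetical names made up for illustration, and aligning the base down to an even index is an assumption about the constant-base case, not a description of Halide's actual codegen.

```python
def ramp(base, stride, lanes):
    """Indices of a ramp: base, base + stride, ..., `lanes` entries in total."""
    return [base + stride * i for i in range(lanes)]


def load(buf, indices):
    """Gather buf at the given indices (stands in for a vector load)."""
    return [buf[i] for i in indices]


def strided_load_via_dense(buf, base, lanes):
    """Stride-2 load expressed as one aligned dense load plus a deinterleave.

    Assumption: the base is aligned down to the nearest even index, so that
    ramp(64, 2, n) and ramp(65, 2, n) share the same dense load.
    """
    aligned = base - (base % 2)
    dense = load(buf, ramp(aligned, 1, 2 * lanes))
    # Offset 0 selects the even lanes, offset 1 the odd lanes.
    return dense[base - aligned::2]


f = list(range(1000, 1200))  # stand-in for an internal allocation
even = strided_load_via_dense(f, 64, 16)
odd = strided_load_via_dense(f, 65, 16)
assert even == load(f, ramp(64, 2, 16))
assert odd == load(f, ramp(65, 2, 16))
```

Both calls read the same 32 contiguous elements starting at index 64, which is the sharing of dense loads that the commit message refers to.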