25db7fe | Steven Johnson | 12 October 2018, 20:29:30 UTC | Blacklist async_copy_chain on Windows for now (Issue #3358) | 12 October 2018, 20:29:30 UTC |
3011b07 | Dillon Sharlet | 12 October 2018, 17:04:07 UTC | Merge pull request #3341 from halide/pdb_fix_simd_op_check_vdmpy_hvx fix simd op check vdmpy and vrmpy failures | 12 October 2018, 17:04:07 UTC |
d6bf209 | Pranav Bhandarkar | 12 October 2018, 03:20:45 UTC | Simplify the sorting of loads by using can_prove | 12 October 2018, 03:20:45 UTC |
78673c2 | Steven Johnson | 11 October 2018, 22:09:07 UTC | Merge pull request #3348 from halide/srj-tidy Fix clang-tidy warning in RunGen.h | 11 October 2018, 22:09:07 UTC |
704e214 | Steven Johnson | 11 October 2018, 18:20:31 UTC | Fix clang-tidy warning in RunGen.h | 11 October 2018, 18:20:31 UTC |
7432d64 | Pranav Bhandarkar | 11 October 2018, 17:40:45 UTC | Incorporate review comments from Steven Johnson | 11 October 2018, 17:40:45 UTC |
140b62e | Steven Johnson | 11 October 2018, 17:38:31 UTC | Merge pull request #3345 from halide/opengl_compute_buffer_types_fix Make OpenGL Compute backend handle non-32-bit integer types. | 11 October 2018, 17:38:31 UTC |
814a383 | Steven Johnson | 11 October 2018, 17:35:12 UTC | Merge pull request #3333 from alexreinking/add-overrides Add `override` keyword to compile with `-Wsuggest-override` | 11 October 2018, 17:35:12 UTC |
e173bff | Steven Johnson | 11 October 2018, 17:34:18 UTC | Merge pull request #3346 from halide/srj-def Only emit default-values to metadata when defined | 11 October 2018, 17:34:18 UTC |
518d6d9 | Zalman Stern | 11 October 2018, 06:56:34 UTC | Address review feedback by adding a comment and fixing an erroneous inversion of template arguments. | 11 October 2018, 06:56:34 UTC |
31ffb1a | Steven Johnson | 11 October 2018, 01:14:39 UTC | Merge pull request #3344 from halide/srj-guess2 Add RunGen features | 11 October 2018, 01:14:39 UTC |
7b63305 | Steven Johnson | 11 October 2018, 00:23:33 UTC | Only emit default-values to metadata when defined Our metadata format was designed to emit null fields for def/min/max of scalar values when said values aren't defined; in the case of Generators with Input<SomeScalar>, we were emitting a default of zero (rather than "no default") for the case of no-default-specified. This fixes the issue and updates the test. | 11 October 2018, 00:23:33 UTC |
c096c2e | Z Stern | 11 October 2018, 00:15:55 UTC | Make OpenGL Compute backend handle non-32-bit integer types. | 11 October 2018, 00:15:55 UTC |
0dcae96 | Steven Johnson | 10 October 2018, 23:19:36 UTC | Merge pull request #3342 from halide/srj-guess Minor formatting and code tightening. | 10 October 2018, 23:19:36 UTC |
679b48e | Steven Johnson | 10 October 2018, 23:08:48 UTC | Add RunGen features - Add utilities to calculate elements_out and bytes_out (in addition to pixels_out) - Add an experimental feature to try to guess reasonable values for unspecified inputs. This needs more work to be generally useful (and the problem may be intractable), but is interesting enough that I'd like to check it in for further experimentation. | 10 October 2018, 23:08:48 UTC |
2719010 | Steven Johnson | 10 October 2018, 21:41:57 UTC | Minor formatting and code tightening. | 10 October 2018, 21:41:57 UTC |
8c0d61d | Steven Johnson | 10 October 2018, 21:18:39 UTC | Merge pull request #3340 from halide/srj-mm Fix divisor in matrix_multiplication.cpp (Issue #3335) | 10 October 2018, 21:18:39 UTC |
f0cae9a | Alex Reinking | 10 October 2018, 20:31:12 UTC | Updating Makefile to work with g++ 5.1 and above | 10 October 2018, 20:31:12 UTC |
ff40919 | Steven Johnson | 10 October 2018, 17:59:52 UTC | Merge pull request #3338 from matthiaskramm/doc Clarify documentation for saturating_cast | 10 October 2018, 17:59:52 UTC |
f487cec | Steven Johnson | 10 October 2018, 17:59:00 UTC | Fix divisor in matrix_multiplication.cpp (Issue #3335) | 10 October 2018, 17:59:00 UTC |
87191f5 | Steven Johnson | 10 October 2018, 17:57:10 UTC | Merge pull request #3332 from halide/srj-rg Refactor RunGen | 10 October 2018, 17:57:10 UTC |
87f65f0 | Matthias Kramm | 10 October 2018, 16:56:43 UTC | Clarify documentation for saturating_cast | 10 October 2018, 16:56:43 UTC |
77c01a5 | Zalman Stern | 10 October 2018, 05:45:11 UTC | Merge pull request #3331 from halide/device_buffer_copy_fixes Device buffer copy fixes | 10 October 2018, 05:45:11 UTC |
61b5492 | Pranav Bhandarkar | 10 October 2018, 04:22:44 UTC | Sort mpys in HexagonOptimize.cpp We sort mpys by mpys.first so that simplify can eliminate interleaves if we make_interleave of vector slices or strided loads. | 10 October 2018, 04:22:44 UTC |
36dcd09 | Alex Reinking | 10 October 2018, 01:21:02 UTC | Merge branch 'master' of https://github.com/halide/Halide into add-overrides # Conflicts: # src/AsyncProducers.cpp | 10 October 2018, 01:21:02 UTC |
8dbfd67 | Alex Reinking | 10 October 2018, 01:13:03 UTC | Enable `-Wsuggest-override` for only G++ >= 5.1. Update CMakeLists.txt. Also fixing overrides in BLAS/Eigen benchmarks. | 10 October 2018, 01:13:03 UTC |
512240e | Alex Reinking | 10 October 2018, 00:38:05 UTC | Adding `override` keyword where applicable. Enabling `-Wsuggest-override` Also fixing a build error on recent versions of GCC, namely a warning, newly enabled by default, about catching polymorphic exceptions by value. Lastly, weakened autoscheduler slowdown constants to make `run_tests` work on my ThinkPad T460p. | 10 October 2018, 00:38:05 UTC |
4217973 | Marcos Slomp | 09 October 2018, 23:19:51 UTC | Merge pull request #3305 from halide/slomp-d3d12-fix Direct3D 12 bug-fixes | 09 October 2018, 23:19:51 UTC |
5b1c37a | Steven Johnson | 09 October 2018, 23:11:10 UTC | Merge pull request #3323 from halide/srj-depth Reduce recursion depth in InitializeSemaphores (Issue #3317) | 09 October 2018, 23:11:10 UTC |
905e4cd | Pranav Bhandarkar | 09 October 2018, 22:12:45 UTC | Merge branch 'master' into pdb_fix_simd_op_check_vdmpy_hvx | 09 October 2018, 22:12:45 UTC |
74f5ca0 | Pranav Bhandarkar | 09 October 2018, 22:00:14 UTC | Sort slices in HexagonOptimize.cpp when generating vmpa, vtmpy, vrmpy and vdmpy Expressions of the form (int32x32(slice_vector(t0, 0, 2, 32) * 80) + int32x32(slice_vector(t0, 1, 2, 32) * 33)) + v0 first get converted to v0 + (int32x32(slice_vector(t0, 0, 2, 32) * 80) + int32x32(slice_vector(t0, 1, 2, 32) * 33)) by the new simplifier and then GroupLoopInvariants reassociates the larger expression like so (v0 + int32x32(slice_vector(t0, 1, 2, 32) * 33)) + int32x32(slice_vector(t0, 0, 2, 32) * 80) This messes up our matching for vtmpy, vrmpy, vdmpy and vmpa instructions because the matching logic is sensitive to the order in which the slices appear. Fix this by sorting the slices in ascending order of start lanes. | 09 October 2018, 22:00:14 UTC |
57a6288 | Steven Johnson | 09 October 2018, 21:54:18 UTC | Add argument_kind | 09 October 2018, 21:54:18 UTC |
f73e856 | Z Stern | 09 October 2018, 21:21:54 UTC | Add tests for trying to use halide_buffer_copy to copy to a null host pointer. Fix device_interface implementation of halide_buffer_copy to fix new tests. | 09 October 2018, 21:21:54 UTC |
0089083 | Marcos Slomp | 09 October 2018, 19:56:07 UTC | Merge remote-tracking branch 'remotes/upstream/master' into slomp-d3d12-fix | 09 October 2018, 19:56:07 UTC |
269a79f | Shoaib Kamil | 09 October 2018, 19:54:58 UTC | Merge pull request #3329 from halide/kamil/rebuild_fft_on_libhalide_change Rebuild FFT app when libHalide changes | 09 October 2018, 19:54:58 UTC |
e500725 | Z Stern | 09 October 2018, 19:28:04 UTC | Add a test for cross device halide_buffer_copy where target is host. | 09 October 2018, 19:28:04 UTC |
7d24a30 | Steven Johnson | 09 October 2018, 19:10:12 UTC | Merge branch 'master' into srj-rg | 09 October 2018, 19:10:12 UTC |
fef91d7 | Steven Johnson | 09 October 2018, 19:09:45 UTC | Refactor RunGen Move most of the interesting parts of RunGen.cpp into a (new) RunGen.h file; rename RunGen.cpp -> RunGenMain.cpp, since it now contains mostly just a main() function and a little support code. The motivation here is to be able to re-use this code for other tests and benchmarks downstream. | 09 October 2018, 19:09:45 UTC |
c0b7cb3 | Dillon Sharlet | 09 October 2018, 18:22:18 UTC | Merge pull request #3328 from aankit-ca/HexagonAlignment Hexagon Alignment Failures - modulus_remainder of vector | 09 October 2018, 18:22:18 UTC |
847adc1 | Shoaib Kamil | 09 October 2018, 16:31:54 UTC | Rebuild FFT app when libHalide changes | 09 October 2018, 16:31:54 UTC |
351e5ce | Dillon Sharlet | 09 October 2018, 15:51:44 UTC | Merge pull request #3326 from halide/pdb_hvx_fix_simd_vcmp_gt Fix HVX vcmp.gt tests in simd_op_check.cpp | 09 October 2018, 15:51:44 UTC |
bcc0183 | Pranav Bhandarkar | 09 October 2018, 14:08:57 UTC | Merge branch 'master' into pdb_hvx_fix_simd_vcmp_gt | 09 October 2018, 14:08:57 UTC |
8742bb8 | Ankit Aggarwal | 09 October 2018, 10:40:50 UTC | Hexagon Alignment Failures - modulus_remainder of vector | 09 October 2018, 10:40:50 UTC |
481c3c7 | Pranav Bhandarkar | 08 October 2018, 20:22:05 UTC | Change one HVX test of vcmp.gt so that it is more consistent with other tests | 08 October 2018, 20:22:05 UTC |
8eea861 | Marcos Slomp | 08 October 2018, 19:54:19 UTC | [d3d12] code review: indentation; also eliminated unnecessary function | 08 October 2018, 19:54:19 UTC |
8a92b80 | Pranav Bhandarkar | 08 October 2018, 19:35:39 UTC | Fix HVX vcmp.gt tests in simd_op_check.cpp These tests were failing because the new simplifier was converting (rightly) select(v0 < v1, v0, v1) to min(v0, v1) thus not matching vcmp.gt. | 08 October 2018, 19:35:39 UTC |
f6e7730 | Steven Johnson | 08 October 2018, 18:36:33 UTC | Merge pull request #3321 from halide/srj-rungen3 Add random: pseudo-input to RunGen | 08 October 2018, 18:36:33 UTC |
a0f09b2 | Marcos Slomp | 08 October 2018, 17:48:30 UTC | Merge remote-tracking branch 'remotes/upstream/master' into slomp-d3d12-fix | 08 October 2018, 17:48:30 UTC |
cccd305 | Steven Johnson | 08 October 2018, 16:22:06 UTC | Clarify comment | 08 October 2018, 16:22:06 UTC |
4bce126 | Steven Johnson | 08 October 2018, 16:21:27 UTC | Merge branch 'master' into srj-depth | 08 October 2018, 16:21:27 UTC |
f46f65b | Steven Johnson | 08 October 2018, 16:20:18 UTC | Merge branch 'master' into srj-rungen3 | 08 October 2018, 16:20:18 UTC |
8d5f9fd | Steven Johnson | 08 October 2018, 16:17:54 UTC | Merge pull request #3320 from halide/srj-rungen Add 'identity' initializer to RunGen | 08 October 2018, 16:17:54 UTC |
07005c9 | Steven Johnson | 05 October 2018, 23:27:29 UTC | Reduce recursion depth in InitializeSemaphores (Issue #3317) | 05 October 2018, 23:27:29 UTC |
6e673cc | Steven Johnson | 05 October 2018, 22:20:21 UTC | Merge pull request #3319 from halide/srj-async Fix rewrap in InitializeSemaphores | 05 October 2018, 22:20:21 UTC |
b7ff033 | Steven Johnson | 05 October 2018, 22:02:08 UTC | update README | 05 October 2018, 22:02:08 UTC |
e4ed988 | Steven Johnson | 05 October 2018, 21:58:49 UTC | Add random: pseudo-input to RunGen | 05 October 2018, 21:58:49 UTC |
191fa44 | Steven Johnson | 05 October 2018, 21:10:55 UTC | fix indentation | 05 October 2018, 21:10:55 UTC |
04a8fc2 | Steven Johnson | 05 October 2018, 21:06:24 UTC | Add 'identity' initializer to RunGen | 05 October 2018, 21:06:24 UTC |
bd48225 | Steven Johnson | 05 October 2018, 18:10:15 UTC | Fix rewrap in InitializeSemaphores If lets is nonempty, the rewrap loop never exits. Switch to reverse iteration. | 05 October 2018, 18:10:15 UTC |
a0e001f | Steven Johnson | 05 October 2018, 16:26:28 UTC | Merge pull request #3313 from halide/srj-config2 Revise apps/ to use halide_config.make instead of LLVM_CONFIG | 05 October 2018, 16:26:28 UTC |
fe25fe5 | Steven Johnson | 05 October 2018, 16:26:03 UTC | Merge branch 'master' into srj-config2 | 05 October 2018, 16:26:03 UTC |
fe40116 | Steven Johnson | 05 October 2018, 16:25:23 UTC | Merge pull request #3312 from halide/srj-config Revise apps/ to depend only on distrib/ folder | 05 October 2018, 16:25:23 UTC |
4e1ef61 | Steven Johnson | 05 October 2018, 16:25:00 UTC | Merge pull request #3316 from halide/srj-irmut2 Remove IRMutator entirely | 05 October 2018, 16:25:00 UTC |
1a4d2c2 | Steven Johnson | 05 October 2018, 16:24:32 UTC | Merge branch 'master' into srj-irmut2 | 05 October 2018, 16:24:32 UTC |
f965fb2 | Steven Johnson | 05 October 2018, 16:24:16 UTC | Merge pull request #3315 from halide/srj-irmut Convert IRMutator->IRMutator2 | 05 October 2018, 16:24:16 UTC |
f07151a | Steven Johnson | 05 October 2018, 16:23:51 UTC | Merge pull request #3314 from halide/srj-fft Temporarily remove stack-size-canary from apps/fft | 05 October 2018, 16:23:51 UTC |
a8dbc57 | Steven Johnson | 05 October 2018, 00:02:43 UTC | Remove IRMutator entirely | 05 October 2018, 00:02:43 UTC |
f622aaa | Steven Johnson | 04 October 2018, 23:56:39 UTC | Convert IRMutator->IRMutator2 The only remaining users of IRMutator are in StorageFolding.cpp; convert these to IRMutator2 so that we can (finally) delete IRMutator. | 04 October 2018, 23:56:39 UTC |
7dde237 | Steven Johnson | 04 October 2018, 23:44:27 UTC | Temporarily remove stack-size-canary from apps/fft It looks like AsyncProducer.cpp is consuming more stack than we have available. Temporarily remove the canary to unbreak builds while investigating. | 04 October 2018, 23:44:27 UTC |
1f68e96 | Steven Johnson | 04 October 2018, 23:17:14 UTC | Revise test/apps Makefile to stop using LLVM_CONFIG Use the halide_config.make stub (from https://github.com/halide/Halide/pull/3312) to allow for clean linking without knowledge of LLVM_CONFIG. (Note that this PR is a sub-branch of https://github.com/halide/Halide/pull/3312, thus includes those changes too) | 04 October 2018, 23:17:14 UTC |
ccd0d43 | Steven Johnson | 04 October 2018, 23:06:21 UTC | Revise apps/ to depend only on distrib/ folder Up to now, the apps/ folders all got the privilege to look inside Halide; this revises them to depend only on the contents of the distrib/ folder, as should be the case for all 'normal' Halide-using apps. This patch mostly mimics the approach used by python_bindings/, which already has this assumption. Details include: - Makefile.inc references HALIDE_DISTRIB_PATH rather than HALIDE_SRC_PATH/HALIDE_BIN_PATH - having the test_apps target define HALIDE_DISTRIB_PATH instead of HALIDE_SRC_PATH/HALIDE_BIN_PATH - moving the 'bin' dirs to be inside the apps/ subfolders (so that they are invariant whether built by test_apps or individually) and making clean address them too - Fixing some assumptions made in the matlab code - adding tools/halide_malloc_trace.h to the distrib folder This also adds a `halide_config.make` stub to the distrib folder (as we have already been doing for CMake and Bazel), so that we can hopefully remove the use of LLVM_CONFIG in the apps/ support as well. (I'm deliberately avoiding doing that in this PR in order to ensure that just the distrib-only change doesn't break anything on its own.) @dsharletg -- I haven't (yet) tested the Matlab-specific changes; they probably could use a closer look from someone more familiar with that code (ie, you) | 04 October 2018, 23:06:21 UTC |
558cfad | Andrew Adams | 04 October 2018, 17:06:48 UTC | Merge pull request #3308 from halide/fix_runtime_buffer_performance Add ALWAYS_INLINE qualifiers to fix debug mode performance | 04 October 2018, 17:06:48 UTC |
6a3a4bf | Steven Johnson | 04 October 2018, 16:35:00 UTC | Merge pull request #3303 from halide/srj-f16 Add support for Buffer<float16> in Python bindings (Issue #3263) | 04 October 2018, 16:35:00 UTC |
c07f3e8 | Steven Johnson | 04 October 2018, 01:36:36 UTC | Merge branch 'master' into srj-f16 | 04 October 2018, 01:36:36 UTC |
c302f10 | Marcos Slomp | 03 October 2018, 22:02:40 UTC | [d3d12] code review : error return codes | 03 October 2018, 22:02:40 UTC |
590f42c | Marcos Slomp | 03 October 2018, 21:34:15 UTC | [d3d12] code review | 03 October 2018, 21:34:15 UTC |
52a52cc | Marcos Slomp | 03 October 2018, 21:30:36 UTC | [d3d12] code review : shuffling definitions around to accommodate for the buffer_contents() implementation | 03 October 2018, 21:30:36 UTC |
a1dd76a | Marcos Slomp | 03 October 2018, 21:29:22 UTC | [d3d12] code review : scaffolding implementation of buffer_contents() for ReadOnly, WriteOnly and ReadWrite d3d12 buffers | 03 October 2018, 21:29:22 UTC |
5422fe6 | Marcos Slomp | 03 October 2018, 21:05:23 UTC | [d3d12] code review | 03 October 2018, 21:05:23 UTC |
bf7500b | Dillon Sharlet | 03 October 2018, 20:21:29 UTC | Merge pull request #3225 from aankit-ca/hexagon_vgather Hexagon vgather support | 03 October 2018, 20:21:29 UTC |
ebbafd7 | Marcos Slomp | 03 October 2018, 20:18:59 UTC | [d3d12] code review : preserving 'user_context' in hashmap_malloc() and hashmap_free() calls | 03 October 2018, 20:18:59 UTC |
af4f951 | Marcos Slomp | 03 October 2018, 19:56:30 UTC | [d3d12] code review: eliminated extraneous spaces | 03 October 2018, 19:56:30 UTC |
748cb48 | Marcos Slomp | 03 October 2018, 19:52:41 UTC | [d3d12] cleanup | 03 October 2018, 19:52:41 UTC |
ec51098 | Marcos Slomp | 03 October 2018, 18:53:42 UTC | [d3d12] fixed issue with "cleanup_on_error_aottest" : that test expects "halide_malloc()" to be called during "<back-end>_device_and_host_malloc()" in order for the user-provided "my_halide_malloc()" to get called enough times to trigger the expected error behavior -- fortunately, simply calling "halide_default_device_and_host_malloc" is virtually identical to the implementation of "d3d12compute_device_and_host_malloc()" thus far. | 03 October 2018, 18:53:42 UTC |
13394eb | Andrew Adams | 03 October 2018, 18:41:42 UTC | Add ALWAYS_INLINE qualifiers to fix debug mode performance | 03 October 2018, 18:41:42 UTC |
c02750d | Steven Johnson | 03 October 2018, 16:31:59 UTC | Empty | 03 October 2018, 16:31:59 UTC |
cbf88cb | Steven Johnson | 03 October 2018, 16:13:44 UTC | Merge branch 'master' into srj-f16 | 03 October 2018, 16:13:44 UTC |
7c8355c | Ankit Aggarwal | 03 October 2018, 08:46:44 UTC | 1. Added v65 feature check before store_in VTCM. 2. Better failure message for using VTCM without v65. | 03 October 2018, 08:46:44 UTC |
dbf7515 | Marcos Slomp | 02 October 2018, 22:40:29 UTC | added new lines | 02 October 2018, 22:40:29 UTC |
d70c3e9 | Marcos Slomp | 02 October 2018, 22:32:37 UTC | removing some hard-coded constants, and better logging on failure | 02 October 2018, 22:32:37 UTC |
76490ce | Marcos Slomp | 02 October 2018, 22:31:09 UTC | [d3d12] fixed erroneous reinterpreting cast when packing data into groupshared memory datum | 02 October 2018, 22:31:09 UTC |
5569b3a | Dillon Sharlet | 02 October 2018, 19:38:32 UTC | Merge pull request #3304 from halide/pdb_hvx_profiling_fix Fix profiling when a stage is offloaded to HVX. | 02 October 2018, 19:38:32 UTC |
c288889 | Steven Johnson | 02 October 2018, 17:59:27 UTC | Merge branch 'master' into srj-f16 | 02 October 2018, 17:59:27 UTC |
079c244 | Steven Johnson | 02 October 2018, 01:00:59 UTC | Merge pull request #3302 from halide/srj-quietdiv Add quiet_div, quiet_mod to C backend (Issue #3300) | 02 October 2018, 01:00:59 UTC |
bb826d3 | Steven Johnson | 02 October 2018, 00:57:55 UTC | Trivial change for buildbot | 02 October 2018, 00:57:55 UTC |
682f392 | Steven Johnson | 01 October 2018, 23:27:30 UTC | Back out internal_assert changes for quiet_div/mod | 01 October 2018, 23:27:30 UTC |
f338362 | Steven Johnson | 01 October 2018, 23:14:20 UTC | assert denom != 0 in quiet_div, quiet_mod for LLVM too | 01 October 2018, 23:14:20 UTC |
d16a9a9 | Steven Johnson | 01 October 2018, 23:09:49 UTC | change user_error -> internal_assert | 01 October 2018, 23:09:49 UTC |
de59814 | Steven Johnson | 01 October 2018, 22:13:08 UTC | Add support for Buffer<float16> in Python bindings (Issue #3263) | 01 October 2018, 22:13:08 UTC |
b936263 | Marcos Slomp | 01 October 2018, 21:48:59 UTC | fixing MSBuild | 01 October 2018, 21:48:59 UTC |