11ef7b7 | aekul | 10 August 2020, 15:37:46 UTC | Fix inlining condition and add debug printing for long inlined producer chains | 10 August 2020, 15:37:46 UTC |
d3149f3 | aekul | 10 August 2020, 15:06:37 UTC | Use estimate_all for depthwise_separable_conv and conv_layer | 10 August 2020, 15:06:37 UTC |
34f90eb | aekul | 10 August 2020, 15:06:07 UTC | Print name of next node to be scheduled in CYOS mode | 10 August 2020, 15:06:07 UTC |
6d8ff1b | aekul | 10 August 2020, 14:53:09 UTC | Fix serial tile sizing | 10 August 2020, 14:53:09 UTC |
27734a7 | aekul | 09 August 2020, 15:26:11 UTC | Don't include output files in benchmark args | 09 August 2020, 15:26:11 UTC |
d3158c6 | aekul | 08 August 2020, 22:57:25 UTC | Rename iir_blur_generator -> iir_blur | 08 August 2020, 22:57:25 UTC |
af954fb | aekul | 08 August 2020, 22:56:50 UTC | Use each app's actual input images/parameters when benchmarking autoscheduler samples | 08 August 2020, 22:56:50 UTC |
085a31a | aekul | 08 August 2020, 22:48:28 UTC | Let the cost model decide if a state is doing excessive recompute | 08 August 2020, 22:48:28 UTC |
45bee63 | aekul | 08 August 2020, 22:47:28 UTC | Update iir_blur | 08 August 2020, 22:47:28 UTC |
b7750b2 | aekul | 08 August 2020, 17:49:43 UTC | Merge branch 'standalone_autoscheduler_gpu' of https://github.com/halide/Halide into standalone_autoscheduler_gpu | 08 August 2020, 17:49:43 UTC |
ef7cc0c | Andrew Adams | 08 August 2020, 17:47:33 UTC | Add host_dirty call to synthetic input buffers in rungen | 08 August 2020, 17:47:33 UTC |
644270b | aekul | 08 August 2020, 15:20:32 UTC | Consider options with less parallelism if there are no alternatives | 08 August 2020, 15:20:32 UTC |
0862bad | aekul | 08 August 2020, 02:40:31 UTC | Merge branch 'standalone_autoscheduler_gpu' of https://github.com/halide/Halide into standalone_autoscheduler_gpu | 08 August 2020, 02:40:31 UTC |
38b8122 | aekul | 07 August 2020, 15:40:15 UTC | Add compute_capability to default target | 07 August 2020, 15:40:15 UTC |
2db1fda | aekul | 07 August 2020, 13:42:37 UTC | Fix interpolate naming in collect_stats.sh | 07 August 2020, 13:42:37 UTC |
475659f | Tzu-Mao Li | 06 August 2020, 17:49:54 UTC | Fix IRPrinter test | 06 August 2020, 17:49:54 UTC |
ecbe859 | aekul | 06 August 2020, 16:39:18 UTC | Replace normalize with output in interpolate_generator.cpp | 06 August 2020, 16:39:18 UTC |
778cf09 | aekul | 06 August 2020, 16:34:01 UTC | Move -march to a variable in Makefile | 06 August 2020, 16:34:01 UTC |
02f8387 | aekul | 06 August 2020, 16:31:30 UTC | Add test_state to Makefile | 06 August 2020, 16:31:30 UTC |
1c5d058 | aekul | 06 August 2020, 16:30:45 UTC | Compute num_realizations for unscheduled producers | 06 August 2020, 16:30:45 UTC |
7fc8d57 | aekul | 04 August 2020, 20:09:09 UTC | Only consider valid compute locations for unscheduled producers | 04 August 2020, 20:09:09 UTC |
52312ef | aekul | 04 August 2020, 16:55:00 UTC | Handle chains of inlined stages when staging producers | 04 August 2020, 16:55:00 UTC |
337f317 | aekul | 03 August 2020, 23:39:13 UTC | Fix tiling of stages with 1 dimension | 03 August 2020, 23:39:13 UTC |
b3e7805 | aekul | 03 August 2020, 18:37:48 UTC | Rename features: *_per_vector -> *_per_point | 03 August 2020, 18:37:48 UTC |
01797f9 | aekul | 03 August 2020, 18:33:17 UTC | Remove local features | 03 August 2020, 18:33:17 UTC |
a6e2856 | aekul | 03 August 2020, 14:27:11 UTC | Fix LoopNestParser's inline checking | 03 August 2020, 14:27:11 UTC |
fec9c34 | aekul | 03 August 2020, 13:41:24 UTC | Don't wait for benchmark queue to finish when in training mode | 03 August 2020, 13:41:24 UTC |
84d7fee | aekul | 03 August 2020, 13:20:58 UTC | Reset weights | 03 August 2020, 13:20:58 UTC |
189c291 | aekul | 03 August 2020, 13:20:15 UTC | Assume unscheduled producers are promoted to registers | 03 August 2020, 13:20:15 UTC |
f15198f | aekul | 03 August 2020, 12:54:31 UTC | Fix LoopNestParser inline condition | 03 August 2020, 12:54:31 UTC |
e9cfa25 | aekul | 03 August 2020, 12:53:45 UTC | Tidy up handling of func inlining decisions | 03 August 2020, 12:53:45 UTC |
010d077 | aekul | 02 August 2020, 20:28:39 UTC | Add Util.h | 02 August 2020, 20:28:39 UTC |
121041a | aekul | 02 August 2020, 20:26:51 UTC | Update working set index in cost_model_generator.cpp | 02 August 2020, 20:26:51 UTC |
f7fdf20 | aekul | 02 August 2020, 19:36:10 UTC | Remove per_vector features | 02 August 2020, 20:00:30 UTC |
72f909b | aekul | 02 August 2020, 19:57:53 UTC | Skip stages with 0 requests in extract_features.py | 02 August 2020, 19:57:53 UTC |
ec11fa5 | aekul | 02 August 2020, 19:55:44 UTC | Remove trace from metrics comparison | 02 August 2020, 19:55:44 UTC |
19ab5c1 | aekul | 02 August 2020, 19:54:57 UTC | Add loop nest parser test | 02 August 2020, 19:54:57 UTC |
426c1a5 | aekul | 02 August 2020, 19:51:05 UTC | Fix working_set index in cost_model_generator.cpp | 02 August 2020, 19:51:05 UTC |
9d4e5ed | aekul | 31 July 2020, 22:44:51 UTC | Add CYOS_FROM_FILE mode | 31 July 2020, 22:44:51 UTC |
654fb16 | aekul | 31 July 2020, 14:12:24 UTC | Update Makefile to include tests | 31 July 2020, 14:12:24 UTC |
4a4d45e | aekul | 31 July 2020, 13:42:41 UTC | Don't exit early from generate_gpu_tilings when a single option exceeds the thread limit | 31 July 2020, 13:42:41 UTC |
494382d | aekul | 31 July 2020, 13:37:18 UTC | Fix vectorized loads | 31 July 2020, 13:37:18 UTC |
501b188 | aekul | 28 July 2020, 21:26:36 UTC | Handle vectorized shared memory loads | 28 July 2020, 21:26:36 UTC |
f4cc439 | aekul | 28 July 2020, 16:44:13 UTC | Fix staging of inlined producers | 28 July 2020, 16:44:23 UTC |
3ebd4ea | aekul | 28 July 2020, 15:35:31 UTC | Prune splits of small, odd number extents | 28 July 2020, 15:35:36 UTC |
2459572 | aekul | 28 July 2020, 01:56:46 UTC | Sort metric comparisons by factor | 28 July 2020, 01:56:46 UTC |
b4888ea | aekul | 28 July 2020, 01:53:29 UTC | Account for serial loops outside a func's realization in points_accessed_per_thread | 28 July 2020, 01:53:29 UTC |
5ff1648 | aekul | 27 July 2020, 21:58:01 UTC | Update GlobalMemInfo.h -> GPUMemInfo.h in autoscheduler.inc | 27 July 2020, 21:58:01 UTC |
f0baefb | aekul | 27 July 2020, 21:57:00 UTC | Rename output file in compare_with_metrics.sh | 27 July 2020, 21:57:00 UTC |
9c30c24 | aekul | 27 July 2020, 21:56:24 UTC | Add depthwise_separable_conv to collect_stats.sh | 27 July 2020, 21:56:24 UTC |
5c1a889 | aekul | 27 July 2020, 21:56:01 UTC | Fix lookup key in compare_with_metrics.py | 27 July 2020, 21:56:01 UTC |
2e2d8b3 | aekul | 27 July 2020, 21:41:51 UTC | Fix thread loop index for nested stages | 27 July 2020, 21:41:51 UTC |
de84c10 | aekul | 27 July 2020, 19:21:18 UTC | Retain all Jacobians when analyzing memory access edges | 27 July 2020, 19:21:18 UTC |
06940fb | aekul | 27 July 2020, 02:59:45 UTC | Prune some states with dynamic local memory allocations | 27 July 2020, 02:59:45 UTC |
c51919d | aekul | 27 July 2020, 01:19:00 UTC | Rename GlobalMemInfo.h -> GPUMemInfo.h | 27 July 2020, 01:19:00 UTC |
ac85d64 | aekul | 26 July 2020, 16:32:34 UTC | Fix local mem feature names in cost_model_generator.cpp | 26 July 2020, 16:32:34 UTC |
0c9d180 | aekul | 26 July 2020, 16:29:05 UTC | Move compute_location debug printing | 26 July 2020, 16:29:05 UTC |
48238b3 | aekul | 26 July 2020, 16:28:25 UTC | Update local mem features to use MemInfo and an AccessAccumulator | 26 July 2020, 16:28:25 UTC |
b7e9a57 | aekul | 25 July 2020, 15:23:44 UTC | Add local metrics to compare_with_metrics.py | 25 July 2020, 15:23:44 UTC |
cc72f59 | aekul | 24 July 2020, 21:37:59 UTC | Handle local memory allocations that can be stored in registers | 24 July 2020, 21:37:59 UTC |
ecafcc3 | aekul | 24 July 2020, 17:09:52 UTC | Store benchmark queue in a directory instead of searching for samples | 24 July 2020, 17:09:52 UTC |
255890c | aekul | 24 July 2020, 14:46:27 UTC | Use get_host_target to get the target once | 24 July 2020, 14:46:27 UTC |
aa9ed0b | aekul | 24 July 2020, 14:16:04 UTC | Add metric comparison option to generate_autotune_results.sh | 24 July 2020, 14:16:04 UTC |
4bf074a | aekul | 24 July 2020, 13:59:09 UTC | Add VectorReduce to ExprBranching | 24 July 2020, 13:59:09 UTC |
a1bacf4 | aekul | 24 July 2020, 03:29:14 UTC | Merge branch 'standalone_autoscheduler_gpu' of https://github.com/halide/Halide into standalone_autoscheduler_gpu | 24 July 2020, 03:29:14 UTC |
256d40f | Andrew Adams | 22 July 2020, 22:35:55 UTC | Correct a few more Block statements | 22 July 2020, 22:35:55 UTC |
0954a84 | Andrew Adams | 22 July 2020, 22:32:18 UTC | Merge remote-tracking branch 'origin/abadams/unordered_blocks' into standalone_autoscheduler_gpu | 22 July 2020, 22:32:18 UTC |
18a1385 | Andrew Adams | 22 July 2020, 22:13:52 UTC | Add test that gpu barriers aren't inserted in unordered blocks | 22 July 2020, 22:13:52 UTC |
37e16c8 | Andrew Adams | 22 July 2020, 22:13:41 UTC | Revert to old block printing for C backend | 22 July 2020, 22:13:41 UTC |
25fbec8 | Andrew Adams | 22 July 2020, 21:55:25 UTC | Add an "ordering" flag to the Block node, and use it to emit aliasing metadata | 22 July 2020, 21:55:25 UTC |
3a84528 | Steven Johnson | 22 July 2020, 17:23:44 UTC | Merge pull request #5131 from halide/srj-hvx-check Improve HVX codegen error reporting | 22 July 2020, 17:23:44 UTC |
8918446 | Steven Johnson | 21 July 2020, 23:16:16 UTC | Merge pull request #5133 from halide/alexreinking-patch-1 Get rid of stale Travis CI build info from README. | 21 July 2020, 23:16:16 UTC |
daf7aa7 | Alex Reinking | 21 July 2020, 22:38:25 UTC | Get rid of stale Travis CI build info from README. | 21 July 2020, 22:38:25 UTC |
54f854e | Steven Johnson | 21 July 2020, 18:44:09 UTC | Update CodeGen_Hexagon.cpp | 21 July 2020, 18:44:09 UTC |
0fb2489 | Steven Johnson | 21 July 2020, 18:41:56 UTC | Update CodeGen_Hexagon.cpp | 21 July 2020, 18:41:56 UTC |
26bfbb4 | Steven Johnson | 21 July 2020, 18:06:00 UTC | Update CodeGen_Hexagon.cpp | 21 July 2020, 18:06:00 UTC |
3237682 | Steven Johnson | 21 July 2020, 17:34:16 UTC | Update CodeGen_Hexagon.cpp | 21 July 2020, 17:34:16 UTC |
10b8ba4 | aekul | 21 July 2020, 15:35:41 UTC | Fix stride indices | 21 July 2020, 15:35:41 UTC |
4fcae20 | Steven Johnson | 20 July 2020, 23:47:49 UTC | Improve HVX codegen error reporting If you try to compile HVX standalone code with HL_TARGET=hexagon-32-noos, you will die because necessary glue functions are defined in hvx_64 or hvx_128 but not 'baseline' hvx. Add an assertion check with a helpful error message to avoid just segfaulting deep inside LLVM. | 20 July 2020, 23:47:49 UTC |
41a756e | Steven Johnson | 20 July 2020, 21:55:21 UTC | Merge pull request #5129 from halide/srj-mkdir Add a couple of missing 'mkdir' usages in Makefile | 20 July 2020, 21:55:21 UTC |
96ca7f3 | Steven Johnson | 20 July 2020, 21:55:08 UTC | Merge branch 'master' into srj-mkdir | 20 July 2020, 21:55:08 UTC |
a30f220 | Steven Johnson | 20 July 2020, 21:54:54 UTC | Merge pull request #5128 from halide/srj-llvm Fix for trunk LLVM | 20 July 2020, 21:54:54 UTC |
937f797 | Steven Johnson | 20 July 2020, 21:09:28 UTC | Add a couple of missing 'mkdir' usages in Makefile | 20 July 2020, 21:09:28 UTC |
06f535f | aekul | 20 July 2020, 20:44:18 UTC | Add 'factor' to compare_with_metrics.py | 20 July 2020, 20:44:18 UTC |
ee2ca29 | aekul | 20 July 2020, 20:43:45 UTC | Handle tail warps in extract_features.py | 20 July 2020, 20:43:45 UTC |
df33922 | Steven Johnson | 20 July 2020, 20:25:13 UTC | Fix for trunk LLVM PrintMachineCode has been removed in LLVM 12/trunk | 20 July 2020, 20:25:13 UTC |
554e1dd | Andrew Adams | 20 July 2020, 18:10:59 UTC | Merge pull request #5125 from halide/abadams/rungenmain_error Add an error message if you forget to compile RunGenMain with a registration file | 20 July 2020, 18:10:59 UTC |
767d3e9 | aekul | 20 July 2020, 18:01:57 UTC | Fix total number of decisions | 20 July 2020, 18:01:57 UTC |
250cb44 | aekul | 20 July 2020, 18:01:20 UTC | Fix fractional memory access strides | 20 July 2020, 18:01:20 UTC |
c723ebf | aekul | 20 July 2020, 02:48:34 UTC | Fix conv_layer Var naming | 20 July 2020, 02:48:34 UTC |
10e7fb0 | aekul | 18 July 2020, 19:26:54 UTC | Move common testing functionality to test/test.h | 18 July 2020, 19:26:54 UTC |
2c03bfa | aekul | 18 July 2020, 19:20:39 UTC | Handle inlined stages in extract_features.py | 18 July 2020, 19:20:39 UTC |
a879b50 | aekul | 18 July 2020, 19:20:09 UTC | Add debug printing to cost model | 18 July 2020, 19:20:09 UTC |
54d84e3 | aekul | 18 July 2020, 19:16:13 UTC | Add timeout to benchmark loop and only search for samples in current batch | 18 July 2020, 19:16:13 UTC |
eb99441 | Alex Reinking | 17 July 2020, 21:58:04 UTC | Merge pull request #5126 from halide/shoaibkamil/llvm_clone_tag Update README to suggest cloning a release of LLVM, not a branch | 17 July 2020, 21:58:04 UTC |
a900b96 | aekul | 17 July 2020, 21:16:05 UTC | Add bounds along a single edge chain when computing global/shared load features | 17 July 2020, 21:16:05 UTC |
50c947b | Shoaib Kamil | 17 July 2020, 20:48:39 UTC | Update README to suggest cloning a release of LLVM, not a branch | 17 July 2020, 20:48:39 UTC |
43f94b3 | Andrew Adams | 17 July 2020, 20:38:21 UTC | Add an error message if you forget to compile RunGenMain with a registration file | 17 July 2020, 20:38:21 UTC |
c7393ad | Steven Johnson | 16 July 2020, 17:45:20 UTC | Merge pull request #5122 from halide/srj-clangfmt Upgrade clang-format to v10 | 16 July 2020, 17:45:20 UTC |
33ecc3f | Steven Johnson | 16 July 2020, 17:29:14 UTC | Upgrade clang-format to v10 Upgrade the clang-format checks to clang-format-10, and reformat code accordingly. Also add a way to specify the clang-format version for `make format`; it defaults to the version of Clang for the current LLVM, but since clang-format doesn't provide stable formatting across versions, this might be wrong. | 16 July 2020, 17:44:49 UTC |