16e4aa7 | Zalman Stern | 09 August 2018, 07:23:36 UTC | Prototype provided bounds tracking logic for copy_to_host/copy_to_device scheduling directives. | 09 August 2018, 07:23:36 UTC |
f8d360c | Zalman Stern | 02 August 2018, 22:50:04 UTC | Prototype async schedule. | 02 August 2018, 22:50:04 UTC |
0125d70 | Zalman Stern | 02 August 2018, 21:59:46 UTC | Merge branch 'async' into hex_dma2_async | 02 August 2018, 21:59:46 UTC |
6342e6b | Dillon Sharlet | 31 July 2018, 19:54:45 UTC | Merge pull request #3151 from vgundlur/hex-dma2 Hex dma2 | 31 July 2018, 19:54:45 UTC |
adcc4ad | Sahar Fatima | 27 July 2018, 06:23:32 UTC | Changes committed to hexagon_dma and hexagon_dma_pool | 27 July 2018, 06:23:32 UTC |
15efcdd | Sahar Fatima | 27 July 2018, 05:38:46 UTC | Changes in CMakeLists to include the new file | 27 July 2018, 05:38:46 UTC |
3660e00 | VenuGopal Reddy Gundluru | 25 July 2018, 16:12:38 UTC | Fixed Linear Alignment for Small ROIs | 25 July 2018, 16:12:38 UTC |
eec09b8 | VenuGopal Reddy Gundluru | 25 July 2018, 14:33:57 UTC | Merge branch 'hex-dma2' of https://github.com/vgundlur/Halide into hex-dma2 | 25 July 2018, 14:33:57 UTC |
d6ed500 | Sahar Fatima | 25 July 2018, 11:41:07 UTC | Merge remote-tracking branch 'origin/hex-dma2-dmapool' into hex-dma2 The test case for small roi 64,32 is failing. Conflicts: src/runtime/hexagon_dma.cpp | 25 July 2018, 11:41:07 UTC |
44a7489 | Dillon Sharlet | 25 July 2018, 05:14:05 UTC | Merge pull request #3087 from vgundlur/hex-dma2 Hex dma2 | 25 July 2018, 05:14:05 UTC |
16fe362 | Z Stern | 25 July 2018, 02:20:47 UTC | Fix error handling case to not remove job from queue while it has active workers. | 25 July 2018, 02:20:47 UTC |
e308287 | Z Stern | 24 July 2018, 23:17:11 UTC | Remove debugging compile to statement code. | 24 July 2018, 23:17:11 UTC |
2a13e1b | Z Stern | 24 July 2018, 23:15:36 UTC | Add Windows DLLEXPORT to extern in new test. | 24 July 2018, 23:15:36 UTC |
94fd0d0 | Z Stern | 24 July 2018, 21:29:33 UTC | Fix typo in 32-bit version of sync primitives. (Only used in async semaphore handling.) Change "atomic_fetch_and_add_acquire_release" to "atomic_fetch_add_acquire_release" for clarity and consistency. Fix thread ID syscall for 32-bit Linux. | 24 July 2018, 21:29:33 UTC |
b28111e | Zalman Stern | 24 July 2018, 07:09:05 UTC | Turn off debug code. | 24 July 2018, 07:09:05 UTC |
85a002a | Zalman Stern | 24 July 2018, 07:08:10 UTC | Fix typo in debug logging. | 24 July 2018, 07:08:10 UTC |
000739d | Z Stern | 24 July 2018, 06:53:58 UTC | Fix a typo the compiler should have caught. | 24 July 2018, 06:53:58 UTC |
13c61d1 | Z Stern | 23 July 2018, 22:56:57 UTC | Merge branch 'master' into async | 23 July 2018, 22:56:57 UTC |
da6ca71 | Z Stern | 23 July 2018, 22:54:06 UTC | Remove may_block field from task structure as it is no longer used. | 23 July 2018, 22:54:06 UTC |
db24235 | Z Stern | 23 July 2018, 22:40:16 UTC | Further refinements to min_threads accounting. Added assert to make sure thread reserves are respected. Passes all tests now. | 23 July 2018, 22:40:16 UTC |
82723af | Steven Johnson | 23 July 2018, 17:18:43 UTC | Merge pull request #3143 from halide/revert-3030-interleave-bug Revert "Interleave bug" | 23 July 2018, 17:18:43 UTC |
38e5ca1 | Steven Johnson | 23 July 2018, 15:45:53 UTC | Revert "Interleave bug" | 23 July 2018, 15:45:53 UTC |
0e41550 | Sahar Fatima | 23 July 2018, 11:42:58 UTC | Added halide_hexagon_free_l2_pool definition in mock implementation. Moved global variable to cpp file. | 23 July 2018, 11:42:58 UTC |
500546f | Sahar Fatima | 23 July 2018, 07:43:26 UTC | Commit changes related to putting cache structure in cpp file. Changed the overridden function signature exclusive to Codegen_Hexagon class. | 23 July 2018, 07:43:26 UTC |
9df61ed | Z Stern | 22 July 2018, 22:34:48 UTC | Fix a bug in MinThreads where it was using the wrong node in a loop. Gets correct min threads values now. Improve the debug naming of tasks so one does not get "fork.fork.fork.fork.fork" etc. in names. | 22 July 2018, 22:34:48 UTC |
4a71957 | Z Stern | 22 July 2018, 22:32:36 UTC | Buy a return ticket from Clown Town after a nice trip initiated by the conversion to IRMutator2. This is what was causing the fan_in test to fail. | 22 July 2018, 22:32:36 UTC |
c79a47d | Steven Johnson | 21 July 2018, 01:24:07 UTC | Merge pull request #3140 from halide/srj-const Buffer::translated() should be const | 21 July 2018, 01:24:07 UTC |
a516852 | Steven Johnson | 20 July 2018, 19:08:02 UTC | Buffer::translated() should be const | 20 July 2018, 19:08:02 UTC |
6e799b6 | Steven Johnson | 20 July 2018, 18:02:39 UTC | Merge pull request #3030 from halide/interleave-bug Interleave bug | 20 July 2018, 18:02:39 UTC |
046fedd | Z Stern | 19 July 2018, 17:26:42 UTC | Empty commit to trigger post_receive hooks | 19 July 2018, 17:26:42 UTC |
9ba5398 | Z Stern | 19 July 2018, 00:45:25 UTC | Trivial change to poke buildbots. | 19 July 2018, 00:45:25 UTC |
fe72c39 | Steven Johnson | 18 July 2018, 19:11:16 UTC | Merge branch 'master' into interleave-bug | 18 July 2018, 19:11:16 UTC |
6650da5 | Z Stern | 18 July 2018, 18:14:07 UTC | Revert changes to dynamic skip stages logic and move async producer fork pass to after skip stages to prevent skip stages from eliding semaphore operations and introducing deadlocks. Fixes failure of correctness_skip_stages_external_array_functions test. | 18 July 2018, 18:14:07 UTC |
410bde7 | VenuGopal Reddy Gundluru | 18 July 2018, 09:52:18 UTC | Merge branch 'master' of https://github.com/halide/Halide into hex-dma2 | 18 July 2018, 09:52:18 UTC |
bbd338a | Z Stern | 18 July 2018, 00:30:50 UTC | Merge branch 'master' into async | 18 July 2018, 00:30:50 UTC |
274781e | Z Stern | 17 July 2018, 23:15:10 UTC | Remove tabs. | 17 July 2018, 23:15:10 UTC |
af4a354 | Z Stern | 17 July 2018, 23:13:26 UTC | Remove TODO comment per Andrew's input in review. | 17 July 2018, 23:13:26 UTC |
d4df23b | Z Stern | 17 July 2018, 21:28:16 UTC | Remove halide_print override which breaks things on Windows and was only there to get better output ordering when debug logging is on for the test. | 17 July 2018, 21:28:16 UTC |
0da3085 | Z Stern | 17 July 2018, 21:24:46 UTC | Try enabling coroutine test for Windows to see if it passes now. | 17 July 2018, 21:24:46 UTC |
7b2f08e | Z Stern | 17 July 2018, 19:44:05 UTC | Mark call counter as pure since this is required to get the optimization it is testing for. (In general all call_counters likely need to be declared pure, but I'll tackle that later. They're obviously not pure, but generally the side effect is only there to count how many times the thing is called assuming there are no side effects.) | 17 July 2018, 19:44:05 UTC |
8c7e2be | Zalman Stern | 17 July 2018, 19:21:11 UTC | Merge pull request #3129 from halide/opengl_compute_cast_fix Fix cast that does not result in a type change in OpenGL Compute backend. | 17 July 2018, 19:21:11 UTC |
2d5cd6d | Z Stern | 17 July 2018, 17:23:08 UTC | Refine requirements for moving potentially invariant if statements outside of for statements to make sure the condition is pure in addition to not depending on the loop index. | 17 July 2018, 17:23:08 UTC |
c8b5162 | Z Stern | 17 July 2018, 06:16:45 UTC | Merge branch 'master' into async | 17 July 2018, 06:16:45 UTC |
05fbbca | Z Stern | 17 July 2018, 06:05:50 UTC | Remove dead code. | 17 July 2018, 06:05:50 UTC |
7bc00ba | Zalman Stern | 17 July 2018, 06:01:01 UTC | Merge pull request #3117 from halide/srj-output-tuple Allow declaring Output<Buffer<>> with tuples (Issue #2980) | 17 July 2018, 06:01:01 UTC |
93834a1 | Z Stern | 17 July 2018, 01:59:42 UTC | Merge branch 'master' into async | 17 July 2018, 01:59:42 UTC |
77a7af0 | Z Stern | 17 July 2018, 01:54:45 UTC | Implement a new model of thread reserve counting for async. Writing down the design idea here in lieu of hopefully forthcoming design doc: Any Parallel For or Fork that contains one or more Acquire nodes will *itself* occupy a thread. These Acquire nodes can be directly underneath the Parallel For or Fork, or inside a child Parallel For or Fork node. If the Parallel For or Fork node only contains direct Acquire nodes, it consumes one thread. If it contains no Acquire nodes at all, it consumes zero threads. In a Block, the number of threads consumed is the maximum of all its Stmts. A Fork node takes the sum of all its branches because in the worst case, each branch could be executed on a different thread. In addition the Fork consumes zero or one *additional* thread by rule 1. Note that a Fork containing no Fork or Parallel For nodes always takes zero or one thread. A Parallel For consumes as many threads as its closure and the number of iterations that are invoked. There is a question here on how many threads to reserve for the Parallel For. The initial design is to account for it as one job's worth of threads and to allow multiple iterations against the top-level work queue only. Parallel Fors that consume zero threads (no Acquire nodes) operate as they always have: without constraint. A job reserves its thread count from a pool. For top-level jobs, this pool is on the work_queue. For subjobs, it comes from their parent pool, which is initialized to the number of threads the parent reserved from its parent (or the work_queue) -- that is the reserve count for the parent. (There is one potential issue in the current implementation that a top-level halide_do_parallel_tasks does not have a min_threads count for the Fork itself. Thus the one thread that gets added there may not be subtracted from the work queue total in the reservation.) | 17 July 2018, 01:54:45 UTC |
77ce496 | Z Stern | 17 July 2018, 01:53:55 UTC | Filter async_parallel generator out of CPP codegen and rungen tests. | 17 July 2018, 01:53:55 UTC |
f138561 | Z Stern | 17 July 2018, 01:36:17 UTC | Update test which depends on task function types to handle extra argument. | 17 July 2018, 01:36:17 UTC |
1b3b138 | Z Stern | 17 July 2018, 01:35:34 UTC | Add handle type info for semaphore types so C++ codegen of async stuff compiles without warnings. | 17 July 2018, 01:35:34 UTC |
24ecb9f | Zalman Stern | 17 July 2018, 00:33:55 UTC | Comment fix per review feedback. | 17 July 2018, 00:33:55 UTC |
e13f810 | Zalman Stern | 17 July 2018, 00:15:42 UTC | Fix cast that does not result in a type change in OpenGL Compute backend. Previously this resulted in a "BAD ID" being generated. Unrelated tab fix. | 17 July 2018, 00:15:42 UTC |
516bcf7 | Steven Johnson | 16 July 2018, 18:32:33 UTC | Merge branch 'master' into interleave-bug | 16 July 2018, 18:32:33 UTC |
a0c7a39 | Steven Johnson | 16 July 2018, 18:26:01 UTC | Merge branch 'master' into srj-output-tuple | 16 July 2018, 18:26:01 UTC |
6acc274 | Steven Johnson | 16 July 2018, 18:25:45 UTC | Merge pull request #3121 from halide/printer_null_string_protect Protect against NULL strings in printer as these occur in debugging. | 16 July 2018, 18:25:45 UTC |
f39b2dd | Zalman Stern | 15 July 2018, 20:07:44 UTC | Change NULL placeholder string per review feedback. | 15 July 2018, 20:07:44 UTC |
c248f33 | Zalman Stern | 12 July 2018, 21:21:36 UTC | Protect against NULL strings in printer as these occur in debugging and having to protect against them is a hassle. Can't think of any case where the performance hit is worth worrying about. | 12 July 2018, 21:21:36 UTC |
6de5ea1 | Sahar Fatima | 12 July 2018, 07:39:39 UTC | Merge branch 'hex-dma2' into hex-dma2-dmapool Conflicts: Makefile src/runtime/hexagon_dma.cpp src/runtime/hexagon_dma_pool.h | 12 July 2018, 07:39:39 UTC |
4f87c98 | Z Stern | 11 July 2018, 22:26:40 UTC | Indentation fix. | 11 July 2018, 22:26:40 UTC |
47af9ff | Z Stern | 11 July 2018, 22:14:12 UTC | Adjustments to thread accounting to make correctness_async work with low thread counts. (Previously it hung. Also tested generator_aot_async_parallel thread counts from 5 to 128 and it still works.) | 11 July 2018, 22:14:12 UTC |
f820b99 | Sahar Fatima | 11 July 2018, 09:46:59 UTC | Changes made to ensure two APIs are merged into one. | 11 July 2018, 09:46:59 UTC |
ce8f5a9 | Sahar Fatima | 11 July 2018, 09:40:14 UTC | Fixed low_mask of Codegen_Hexagon.cpp and a comment | 11 July 2018, 09:40:14 UTC |
a7d31f7 | Zalman Stern | 11 July 2018, 07:36:26 UTC | Convert AsyncProducer.cpp to IRMutator2. Remove tabs. | 11 July 2018, 07:36:26 UTC |
efca7a6 | Z Stern | 11 July 2018, 01:27:55 UTC | Update comment on error handling in async closures. | 11 July 2018, 01:27:55 UTC |
501b00c | Z Stern | 11 July 2018, 01:20:54 UTC | Remove failed jobs from job queue. Clean up failure handling a bit. error_async_require_fail now seems to pass reliably. | 11 July 2018, 01:20:54 UTC |
5240545 | Steven Johnson | 11 July 2018, 00:16:31 UTC | Allow declaring Output<Buffer<>> with tuples (Issue #2980) For AOT, this produces an output buffer for each tuple element. | 11 July 2018, 00:16:31 UTC |
5db3ac1 | Zalman Stern | 10 July 2018, 23:32:03 UTC | Add a test for error in async invoked closure. Fix hang for error in async invoked closure. Code sometimes fails without the error message being printed. Looks like it may be crashing in a thread. Checkpointing so I can debug on Linux and to capture the current state. | 10 July 2018, 23:32:03 UTC |
4556fc3 | Sahar Fatima | 10 July 2018, 11:16:04 UTC | Add scoped mutex lock changes to hexagon_dma_pool.cpp | 10 July 2018, 11:16:04 UTC |
073cc15 | Sahar Fatima | 10 July 2018, 11:02:00 UTC | Forgot to add WEAK for the globals | 10 July 2018, 11:02:00 UTC |
15e34e8 | Sahar Fatima | 10 July 2018, 06:02:12 UTC | File not added | 10 July 2018, 06:02:12 UTC |
40984af | Zalman Stern | 10 July 2018, 01:40:36 UTC | Move to using ScopedValue in a few places. | 10 July 2018, 01:40:36 UTC |
89457a4 | Zalman Stern | 10 July 2018, 01:29:28 UTC | Remove debug(0) comments. | 10 July 2018, 01:29:28 UTC |
ed8d6a0 | Zalman Stern | 10 July 2018, 01:29:07 UTC | Update comments to cover task parent pass through argument. | 10 July 2018, 01:29:07 UTC |
6444dc5 | Zalman Stern | 10 July 2018, 01:13:01 UTC | Remove ".NOTPARALLEL" that was used for debugging. | 10 July 2018, 01:13:01 UTC |
6284e9e | Zalman Stern | 09 July 2018, 23:01:46 UTC | Remove extraneous if statements. | 09 July 2018, 23:01:46 UTC |
e2d2de3 | Zalman Stern | 09 July 2018, 22:49:59 UTC | Remove tabs. | 09 July 2018, 22:49:59 UTC |
daca3d3 | Zalman Stern | 09 July 2018, 22:46:58 UTC | Merge branch 'master' into async | 09 July 2018, 22:46:58 UTC |
0b29cac | Steven Johnson | 09 July 2018, 20:09:43 UTC | Merge pull request #3040 from matthiaskramm/user_context Support user_context in Python extensions | 09 July 2018, 20:09:43 UTC |
99b2923 | Sahar Fatima | 09 July 2018, 11:44:06 UTC | Changes in DMA Pool Logic. Now each virtual engine is assigned a dma engine during its lifetime. | 09 July 2018, 11:44:06 UTC |
5c1f28d | Andrew Adams | 07 July 2018, 16:47:20 UTC | Merge pull request #3106 from halide/includes_fix Add stdio.h includes per https://github.com/halide/Halide/issues/3102 . | 07 July 2018, 16:47:20 UTC |
083534f | Zalman Stern | 07 July 2018, 15:56:04 UTC | Add stdio.h includes per https://github.com/halide/Halide/issues/3102 . | 07 July 2018, 15:56:04 UTC |
beec0b1 | Pranav Bhandarkar | 06 July 2018, 21:35:46 UTC | Incorporate comments from review 1) Add logic in AlignLoads that makes it deal only with loads that have stride 1, 2 or 3. (Had accidentally dropped this piece of code) 2) is_aligned now returns true or false only. Remove HexagonAlign enum. | 06 July 2018, 21:35:46 UTC |
811e375 | Sahar Fatima | 06 July 2018, 16:37:09 UTC | Commit in changes to remove default_allocator | 06 July 2018, 16:37:09 UTC |
42b86f9 | Sahar Fatima | 05 July 2018, 08:51:33 UTC | Changed the scope of cache mutex. Changed logic of overflow restricting to 24 bytes. Moved APIs to pool internal header file. | 05 July 2018, 08:51:33 UTC |
0bf9fb8 | Pranav Bhandarkar | 04 July 2018, 15:38:19 UTC | Incorporate review comments | 04 July 2018, 15:38:19 UTC |
c24ca0e | Z Stern | 03 July 2018, 22:51:11 UTC | Remove thread local storage based code to propagate job parent tree. It has been replaced by passing an opaque void * through the compiler generated closure and then back to the call to invoke tasks in the runtime. Remove debugging code as it was cluttering the file and had intertwined dependencies with the thread local storage support. Cleanup some comments. | 03 July 2018, 22:51:11 UTC |
ed778e7 | Sahar Fatima | 03 July 2018, 10:34:39 UTC | DMA Pool added to ensure multiple dma engine usage when parallel directive is invoked. During halide_buffer_copy one dma engine is picked from pool and put back once transfer is finished. This ensures that if multiple threads are performing halide_buffer_copy, multiple DMA Engines can be used. | 03 July 2018, 10:34:39 UTC |
415b30e | Z Stern | 02 July 2018, 20:42:45 UTC | Merge branch 'master' into async | 02 July 2018, 20:42:45 UTC |
baffac3 | Andrew Adams | 29 June 2018, 22:06:53 UTC | Merge pull request #3084 from halide/fix_3078 Fix #3078 | 29 June 2018, 22:06:53 UTC |
fb8038b | Andrew Adams | 29 June 2018, 22:06:44 UTC | Merge pull request #3075 from halide/fix_3070 Fix #3070 | 29 June 2018, 22:06:44 UTC |
7b58714 | Andrew Adams | 29 June 2018, 22:05:52 UTC | Address review comments | 29 June 2018, 22:05:52 UTC |
359151b | Zalman Stern | 29 June 2018, 21:38:06 UTC | Enable test that now works due to Andrew's fix. | 29 June 2018, 21:38:06 UTC |
600c7ca | Andrew Adams | 29 June 2018, 21:26:15 UTC | Merge branch 'async' of https://github.com/halide/Halide into async | 29 June 2018, 21:26:15 UTC |
e5575bc | Andrew Adams | 29 June 2018, 21:25:57 UTC | Don't accidentally omit other side-effecting stuff in a skip-stages guard We were guarding the semaphore releases. | 29 June 2018, 21:25:57 UTC |
d85016f | Zalman Stern | 29 June 2018, 21:12:56 UTC | Move #if 0 to be around code that hangs to prevent buildbots from hanging on test. | 29 June 2018, 21:12:56 UTC |
cd9cd8d | Zalman Stern | 29 June 2018, 20:58:46 UTC | Add test case for ways to do parallel compute roots. Demonstrates miscompilation in AsyncComputeAt case. | 29 June 2018, 20:58:46 UTC |
b9688c0 | Zalman Stern | 29 June 2018, 17:34:01 UTC | Merge pull request #3092 from jsn1993/shunning-type-fix Fix the type of an index variable | 29 June 2018, 17:34:01 UTC |
603e34f | VenuGopal Reddy Gundluru | 29 June 2018, 17:21:36 UTC | Incorporated review Comments from Dillon | 29 June 2018, 17:21:36 UTC |
3943a51 | VenuGopal Reddy Gundluru | 29 June 2018, 16:38:03 UTC | Incorporated Review Comments from Dillon | 29 June 2018, 16:38:03 UTC |
13ed740 | Shunning Jiang | 29 June 2018, 00:26:54 UTC | Fix the type of an index variable On line 2583, the entry should be pair<string, int>, but it gets pushed into a vector of pair<string, bool>. | 29 June 2018, 00:26:54 UTC |