https://github.com/halide/Halide

Revision Author Date Message Commit Date
16e4aa7 Prototype provided bounds tracking logic for copy_to_host/copy_to_device scheduling directives. 09 August 2018, 07:23:36 UTC
f8d360c Prototype async schedule. 02 August 2018, 22:50:04 UTC
0125d70 Merge branch 'async' into hex_dma2_async 02 August 2018, 21:59:46 UTC
6342e6b Merge pull request #3151 from vgundlur/hex-dma2 Hex dma2 31 July 2018, 19:54:45 UTC
adcc4ad Changes committed to hexagon_dma and hexagon_dma_pool 27 July 2018, 06:23:32 UTC
15efcdd Changes in CMakeLists to include the new file 27 July 2018, 05:38:46 UTC
3660e00 Fixed Linear Alignment for Small ROIs 25 July 2018, 16:12:38 UTC
eec09b8 Merge branch 'hex-dma2' of https://github.com/vgundlur/Halide into hex-dma2 25 July 2018, 14:33:57 UTC
d6ed500 Merge remote-tracking branch 'origin/hex-dma2-dmapool' into hex-dma2 The test case for small roi 64,32 is failing. Conflicts: src/runtime/hexagon_dma.cpp 25 July 2018, 11:41:07 UTC
44a7489 Merge pull request #3087 from vgundlur/hex-dma2 Hex dma2 25 July 2018, 05:14:05 UTC
16fe362 Fix error handling case to not remove job from queue while it has active workers. 25 July 2018, 02:20:47 UTC
e308287 Remove debugging compile to statement code. 24 July 2018, 23:17:11 UTC
2a13e1b Add Windows DLLEXPORT to extern in new test. 24 July 2018, 23:15:36 UTC
94fd0d0 Fix typo in 32-bit version of sync primitives. (Only used in async semaphore handling.) Change "atomic_fetch_and_add_acquire_release" to "atomic_fetch_add_acquire_release" for clarity and consistency. Fix thread ID syscall for 32-bit Linux. 24 July 2018, 21:29:33 UTC
b28111e Turn off debug code. 24 July 2018, 07:09:05 UTC
85a002a Fix typo in debug logging. 24 July 2018, 07:08:10 UTC
000739d Fix a typo the compiler should have caught. 24 July 2018, 06:53:58 UTC
13c61d1 Merge branch 'master' into async 23 July 2018, 22:56:57 UTC
da6ca71 Remove may_block field from task structure as it is no longer used. 23 July 2018, 22:54:06 UTC
db24235 Further refinements to min_threads accounting. Added assert to make sure thread reserves are respected. Passes all tests now. 23 July 2018, 22:40:16 UTC
82723af Merge pull request #3143 from halide/revert-3030-interleave-bug Revert "Interleave bug" 23 July 2018, 17:18:43 UTC
38e5ca1 Revert "Interleave bug" 23 July 2018, 15:45:53 UTC
0e41550 Added halide_hexagon_free_l2_pool definition in mock implementation. Moved global variable to cpp file. 23 July 2018, 11:42:58 UTC
500546f Commit changes related to putting cache structure in cpp file. Changed the overridden function signature exclusive to Codegen_Hexagon class. 23 July 2018, 07:43:26 UTC
9df61ed Fix a bug in MinThreads where it was using the wrong node in a loop. Gets correct min threads values now. Improve the debug naming of tasks so one does not get "fork.fork.fork.fork.fork" etc. in names. 22 July 2018, 22:34:48 UTC
4a71957 Buy a return ticket from Clown Town after a nice trip initiated by the conversion to IRMutator2. This is what was causing the fan_in test to fail. 22 July 2018, 22:32:36 UTC
c79a47d Merge pull request #3140 from halide/srj-const Buffer::translated() should be const 21 July 2018, 01:24:07 UTC
a516852 Buffer::translated() should be const 20 July 2018, 19:08:02 UTC
6e799b6 Merge pull request #3030 from halide/interleave-bug Interleave bug 20 July 2018, 18:02:39 UTC
046fedd Empty commit to trigger post_receive hooks 19 July 2018, 17:26:42 UTC
9ba5398 Trivial change to poke buildbots. 19 July 2018, 00:45:25 UTC
fe72c39 Merge branch 'master' into interleave-bug 18 July 2018, 19:11:16 UTC
6650da5 Revert changes to dynamic skip stages logic and move async producer fork pass to after skip stages to prevent skip stages from eliding semaphore operations and introducing deadlocks. Fixes failure of correctness_skip_stages_external_array_functions test. 18 July 2018, 18:14:07 UTC
410bde7 Merge branch 'master' of https://github.com/halide/Halide into hex-dma2 18 July 2018, 09:52:18 UTC
bbd338a Merge branch 'master' into async 18 July 2018, 00:30:50 UTC
274781e Remove tabs. 17 July 2018, 23:15:10 UTC
af4a354 Remove TODO comment per Andrew's input in review. 17 July 2018, 23:13:26 UTC
d4df23b Remove halide_print override which breaks things on Windows and was only there to get better output ordering when debug logging is on for the test. 17 July 2018, 21:28:16 UTC
0da3085 Try enabling coroutine test for Windows to see if it passes now. 17 July 2018, 21:24:46 UTC
7b2f08e Mark call counter as pure since this is required to get the optimization it is testing for. (In general all call_counters likely need to be declared pure, but I'll tackle that later. They're obviously not pure, but generally the side effect is only there to count how many times the thing is called assuming there are no side effects.) 17 July 2018, 19:44:05 UTC
8c7e2be Merge pull request #3129 from halide/opengl_compute_cast_fix Fix cast that does not result in a type change in OpenGL Compute backend. 17 July 2018, 19:21:11 UTC
2d5cd6d Refine requirements for moving potentially invariant if statements outside of for statements to make sure the condition is pure in addition to not depending on the loop index. 17 July 2018, 17:23:08 UTC
c8b5162 Merge branch 'master' into async 17 July 2018, 06:16:45 UTC
05fbbca Remove dead code. 17 July 2018, 06:05:50 UTC
7bc00ba Merge pull request #3117 from halide/srj-output-tuple Allow declaring Output<Buffer<>> with tuples (Issue #2980) 17 July 2018, 06:01:01 UTC
93834a1 Merge branch 'master' into async 17 July 2018, 01:59:42 UTC
77a7af0 Implement a new model of thread reserve counting for async. Writing down the design idea here in lieu of hopefully forthcoming design doc: Any Parallel For or Fork that contains one or more Acquire nodes will *itself* occupy a thread. These Acquire nodes can be directly underneath the Parallel For or Fork, or inside a child Parallel For or Fork node. If the Parallel For or Fork node only contains direct Acquire nodes, it consumes one thread. If it contains no Acquire nodes at all, it consumes zero threads. In a Block, the number of threads consumed is the maximum of all its Stmts. A Fork node takes the sum of all its branches because in the worst case, each branch could be executed on a different thread. In addition the Fork consumes zero or one *additional* thread by rule 1. Note that a Fork containing no Fork or Parallel For nodes always takes zero or one thread. A Parallel For consumes as many threads as its closure and the number of iterations that are invoked. There is a question here on how many threads to reserve for the Parallel For. The initial design is to account for it as one job's worth of threads and to allow multiple iterations against the top-level work queue only. Parallel Fors that consume zero threads (no Acquire nodes) operate as they always have: without constraint. A job reserves its thread count from a pool. For top-level jobs, this pool is on the work_queue. For subjobs, it comes from their parent pool, which is initialized to the number of threads the parent reserved from its parent (or the work_queue) -- that is the reserve count for the parent. (There is one potential issue in the current implementation that a top-level halide_do_parallel_tasks does not have a min_threads count for the Fork itself. Thus the one thread that gets added there may not be subtracted from the work queue total in the reservation.) 17 July 2018, 01:54:45 UTC
77ce496 Filter async_parallel generator out of CPP codegen and rungen tests. 17 July 2018, 01:53:55 UTC
f138561 Update test which depends on task function types to handle extra argument. 17 July 2018, 01:36:17 UTC
1b3b138 Add handle type info for semaphore types so C++ codegen of async stuff compiles without warnings. 17 July 2018, 01:35:34 UTC
24ecb9f Comment fix per review feedback. 17 July 2018, 00:33:55 UTC
e13f810 Fix cast that does not result in a type change in OpenGL Compute backend. Previously this resulted in a "BAD ID" being generated. Unrelated tab fix. 17 July 2018, 00:15:42 UTC
516bcf7 Merge branch 'master' into interleave-bug 16 July 2018, 18:32:33 UTC
a0c7a39 Merge branch 'master' into srj-output-tuple 16 July 2018, 18:26:01 UTC
6acc274 Merge pull request #3121 from halide/printer_null_string_protect Protect against NULL strings in printer as these occur in debugging. 16 July 2018, 18:25:45 UTC
f39b2dd Change NULL placeholder string per review feedback. 15 July 2018, 20:07:44 UTC
c248f33 Protect against NULL strings in printer as these occur in debugging and having to protect against them is a hassle. Can't think of any case where the performance hit is worth worrying about. 12 July 2018, 21:21:36 UTC
6de5ea1 Merge branch 'hex-dma2' into hex-dma2-dmapool Conflicts: Makefile src/runtime/hexagon_dma.cpp src/runtime/hexagon_dma_pool.h 12 July 2018, 07:39:39 UTC
4f87c98 Indentation fix. 11 July 2018, 22:26:40 UTC
47af9ff Adjustments to thread accounting to make correctness_async work with low thread counts. (Previously it hung. Also tested generator_aot_async_parallel thread counts from 5 to 128 and it still works.) 11 July 2018, 22:14:12 UTC
f820b99 Changes made to ensure two APIs are merged into one. 11 July 2018, 09:46:59 UTC
ce8f5a9 Fixed low_mask of Codegen_Hexagon.cpp and a comment 11 July 2018, 09:40:14 UTC
a7d31f7 Convert AsyncProducer.cpp to IRMutator2. Remove tabs. 11 July 2018, 07:36:26 UTC
efca7a6 Update comment on error handling in async closures. 11 July 2018, 01:27:55 UTC
501b00c Remove failed jobs from job queue. Clean up failure handling a bit. error_async_require_fail now seems to pass reliably. 11 July 2018, 01:20:54 UTC
5240545 Allow declaring Output<Buffer<>> with tuples (Issue #2980) For AOT, this produces an output buffer for each tuple element. 11 July 2018, 00:16:31 UTC
5db3ac1 Add a test for error in async invoked closure. Fix hang for error in async invoked closure. Code sometimes fails without the error message being printed. Looks like it may be crashing in a thread. Checkpointing so I can debug on Linux and to capture the current state. 10 July 2018, 23:32:03 UTC
4556fc3 Add scoped mutex lock changes to hexagon_dma_pool.cpp 10 July 2018, 11:16:04 UTC
073cc15 Forgot to add WEAK for the globals 10 July 2018, 11:02:00 UTC
15e34e8 File not added 10 July 2018, 06:02:12 UTC
40984af Move to using ScopedValue in a few places. 10 July 2018, 01:40:36 UTC
89457a4 Remove debug(0) comments. 10 July 2018, 01:29:28 UTC
ed8d6a0 Update comments to cover task parent pass through argument. 10 July 2018, 01:29:07 UTC
6444dc5 Remove ".NOTPARALLEL" that was used for debugging. 10 July 2018, 01:13:01 UTC
6284e9e Remove extraneous if statements. 09 July 2018, 23:01:46 UTC
e2d2de3 Remove tabs. 09 July 2018, 22:49:59 UTC
daca3d3 Merge branch 'master' into async 09 July 2018, 22:46:58 UTC
0b29cac Merge pull request #3040 from matthiaskramm/user_context Support user_context in Python extensions 09 July 2018, 20:09:43 UTC
99b2923 Changes in DMA Pool Logic. Now each virtual engine is assigned a dma engine during its lifetime. 09 July 2018, 11:44:06 UTC
5c1f28d Merge pull request #3106 from halide/includes_fix Add stdio.h includes per https://github.com/halide/Halide/issues/3102 . 07 July 2018, 16:47:20 UTC
083534f Add stdio.h includes per https://github.com/halide/Halide/issues/3102 . 07 July 2018, 15:56:04 UTC
beec0b1 Incorporate comments from review 1) Add logic in AlignLoads that makes it deal only with loads that have stride 1, 2 or 3. (Had accidentally dropped this piece of code) 2) is_aligned now returns true or false only. Remove HexagonAlign enum. 06 July 2018, 21:35:46 UTC
811e375 Commit in changes to remove default_allocator 06 July 2018, 16:37:09 UTC
42b86f9 Changed the scope of cache mutex. Changed logic of overflow restricting to 24 bytes. Moved APIs to pool internal header file. 05 July 2018, 08:51:33 UTC
0bf9fb8 Incorporate review comments 04 July 2018, 15:38:19 UTC
c24ca0e Remove thread local storage based code to propagate job parent tree. It has been replaced by passing an opaque void * through the compiler generated closure and then back to the call to invoke tasks in the runtime. Remove debugging code as it was cluttering the file and had intertwined dependencies with the thread local storage support. Cleanup some comments. 03 July 2018, 22:51:11 UTC
ed778e7 DMA Pool added to ensure multiple dma engine usage when parallel directive is invoked. During halide_buffer_copy one dma engine is picked from pool and put back once transfer is finished. This ensures if multiple threads are performing halide_buffer_copy , multiple DMA Engines can be used 03 July 2018, 10:34:39 UTC
415b30e Merge branch 'master' into async 02 July 2018, 20:42:45 UTC
baffac3 Merge pull request #3084 from halide/fix_3078 Fix #3078 29 June 2018, 22:06:53 UTC
fb8038b Merge pull request #3075 from halide/fix_3070 Fix #3070 29 June 2018, 22:06:44 UTC
7b58714 Address review comments 29 June 2018, 22:05:52 UTC
359151b Enable test that now works due to Andrew's fix. 29 June 2018, 21:38:06 UTC
600c7ca Merge branch 'async' of https://github.com/halide/Halide into async 29 June 2018, 21:26:15 UTC
e5575bc Don't accidentally omit other side-effecting stuff in a skip-stages guard We were guarding the semaphore releases. 29 June 2018, 21:25:57 UTC
d85016f Move #if 0 to be around code that hangs to prevent buildbots from hanging on test. 29 June 2018, 21:12:56 UTC
cd9cd8d Add test case for ways to do parallel compute roots. Demonstrates miscompilation in AsyncComputeAt case. 29 June 2018, 20:58:46 UTC
b9688c0 Merge pull request #3092 from jsn1993/shunning-type-fix Fix the type of an index variable 29 June 2018, 17:34:01 UTC
603e34f Incorporated review Comments from Dillon 29 June 2018, 17:21:36 UTC
3943a51 Incorporated Review Comments from Dillon 29 June 2018, 16:38:03 UTC
13ed740 Fix the type of an index variable On line 2583, the entry should be pair<string, int>, but it gets pushed into a vector of pair<string, bool>. 29 June 2018, 00:26:54 UTC
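
The thread reserve counting rules written down in commit 77a7af0 can be illustrated with a small model. The sketch below is not Halide code: the Stmt class, the kind strings, and the functions contains_acquire and min_threads are hypothetical names chosen for illustration, and the statement tree is a simplified stand-in for Halide's IR. It also assumes one reading of the open question in the message, namely that a Parallel For is accounted for as a single job, so its count is the count of its body plus the zero-or-one thread it occupies itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Stmt:
    """Hypothetical, simplified statement node (not Halide's IR)."""
    kind: str  # "leaf", "acquire", "block", "fork", or "parallel_for"
    children: List["Stmt"] = field(default_factory=list)


def contains_acquire(s: Stmt) -> bool:
    """True if s is an Acquire or has one anywhere beneath it."""
    return s.kind == "acquire" or any(contains_acquire(c) for c in s.children)


def min_threads(s: Stmt) -> int:
    """Threads a statement must reserve under the rules in the commit message."""
    counts = [min_threads(c) for c in s.children]
    # Rule 1: a Fork or Parallel For containing any Acquire occupies a thread itself.
    own = 1 if contains_acquire(s) else 0
    if s.kind == "fork":
        # Worst case, every branch runs on a different thread, so sum them.
        return sum(counts) + own
    if s.kind == "parallel_for":
        # Assumption: counted as one job's worth of threads, not per iteration.
        return max(counts, default=0) + own
    # Blocks, Acquire bodies, and leaves: maximum over the child statements.
    return max(counts, default=0)


leaf = Stmt("leaf")
acquire = Stmt("acquire", [leaf])

no_sync = Stmt("fork", [leaf, leaf])
simple = Stmt("fork", [acquire, leaf])
nested = Stmt("parallel_for", [Stmt("fork", [acquire, acquire])])
block = Stmt("block", [simple, Stmt("parallel_for", [leaf])])
wide = Stmt("fork", [Stmt("parallel_for", [acquire]),
                     Stmt("parallel_for", [acquire])])

print(min_threads(no_sync), min_threads(simple), min_threads(nested),
      min_threads(block), min_threads(wide))
```

In this model a Fork with no nested Fork or Parallel For reserves zero or one thread, matching the note in the commit message. The reservation pools (work_queue for top-level jobs, the parent's reserve for subjobs) are not modelled here.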