8b05df4 | Yong He | 20 February 2023, 18:17:00 UTC | Add static for loop iteration inference. (#2659) | 20 February 2023, 18:17:00 UTC |
a8da735 | Sai Praveen Bangaru | 18 February 2023, 00:03:52 UTC | Allocate N+1 arrays instead of N to avoid out-of-bounds access when unzipping loops (#2663) | 18 February 2023, 00:03:52 UTC |
92ccc8f | Sai Praveen Bangaru | 17 February 2023, 23:02:58 UTC | AD: More legacy type handling cleanup + user-defined reverse-mode fix (#2662) * WIP: Remove all legacy type checking * Fixed issue with user-defined backward derivatives not bypassing the AD process --------- Co-authored-by: Yong He <yonghe@outlook.com> | 17 February 2023, 23:02:58 UTC |
5cd39d1 | Sai Praveen Bangaru | 17 February 2023, 22:56:07 UTC | AD: Remove the original loop condition upon inversion (#2661) * Remove the original condition upon loop inversion (it's redundant, and causes out-of-bounds accesses) * minor fix (also removed the first loop check skip) * Cleanup unused insts * minor comment fix | 17 February 2023, 22:56:07 UTC |
0516073 | Yong He | 17 February 2023, 21:23:27 UTC | Fixed crash when lowering IR for no_diff struct member. (#2658) * Fixed crash when lowering IR for no_diff struct member. * Improve `setInsertBeforeOrdinaryInst` and `setInsertAfterOrdinaryInst`. --------- Co-authored-by: Yong He <yhe@nvidia.com> | 17 February 2023, 21:23:27 UTC |
79049bc | Sai Praveen Bangaru | 17 February 2023, 21:22:47 UTC | Cleaned up legacy differential type handling + type casting bugfixes (#2660) | 17 February 2023, 21:22:47 UTC |
f253d15 | Sai Praveen Bangaru | 17 February 2023, 17:03:59 UTC | Proper reverse-mode loop handling with splitting + inversion steps (#2656) * Halfway to loop inversion * More progress towards proper loop inversion * More progress towards inverse insts. Only thing left is adding `counter>=0` at the right place * More fixes for inversion step. * Lots more fixes, added primal inst 'hoisting' mechanism as the central method that ensures primal values are placed in the right spot * Loop inversion is now functional * Cleaned up commented code * rename diffCounterVar -> diffCounterParam * minor update * removed some comments and commented code * Switch `IRBuilder(sharedIRBuilder)` to `IRBuilder(moduleInst)` | 17 February 2023, 17:03:59 UTC |
245466d | Yong He | 17 February 2023, 00:44:04 UTC | Remove `SharedIRBuilder`. (#2657) Co-authored-by: Yong He <yhe@nvidia.com> | 17 February 2023, 00:44:04 UTC |
4c4826d | Yong He | 16 February 2023, 21:55:32 UTC | Overhaul global inst deduplication and cpp/cuda backend. (#2654) * Overhaul global inst deduplication and cpp/cuda backend. * Update IR documentation. --------- Co-authored-by: Yong He <yhe@nvidia.com> | 16 February 2023, 21:55:32 UTC |
eda88e5 | Yong He | 16 February 2023, 20:44:32 UTC | Design doc on IR deduplication and best practices when working with IR | 16 February 2023, 20:44:32 UTC |
8667515 | Yong He | 16 February 2023, 01:18:00 UTC | Treat user defined backward derivative function as non differentiable. (#2650) Co-authored-by: Yong He <yhe@nvidia.com> | 16 February 2023, 01:18:00 UTC |
266dc66 | jsmall-nvidia | 15 February 2023, 21:02:11 UTC | Upgrade to GLSLANG 12.0.0 binaries (#2652) * #include an absolute path didn't work - because paths were taken to always be relative. * Upgrade GLSLANG binaries to 12.0.0 | 15 February 2023, 21:02:11 UTC |
f13e080 | jsmall-nvidia | 15 February 2023, 19:32:50 UTC | Upgrade GLSLANG 12.0.0 (#2651) * #include an absolute path didn't work - because paths were taken to always be relative. * Update to glslang 12.0.0. Update SPIRV-Tools SPIRV-Headers. | 15 February 2023, 19:32:50 UTC |
598e07f | jsmall-nvidia | 14 February 2023, 23:30:04 UTC | Preliminary Shader Execution Reordering Doc (#2648) * #include an absolute path didn't work - because paths were taken to always be relative. * Add preliminary Shader Execution Reordering doc. Update target-compatibility docs. * Fix debugBreak. | 14 February 2023, 23:30:04 UTC |
b92a75d | jsmall-nvidia | 14 February 2023, 21:21:07 UTC | Preliminary debugBreak support (#2647) * #include an absolute path didn't work - because paths were taken to always be relative. * Preliminary support for debug break. * Add C++ debug break support. Add details about usage. * Improve debug break test details. * Make HLSL output a comment about no support. * Handle specialize for target assert, without a body if it has spv_instruction/target intrinsic | 14 February 2023, 21:21:07 UTC |
ec49215 | Yong He | 13 February 2023, 19:05:29 UTC | Various auto-diff bug fixes. (#2646) Co-authored-by: Yong He <yhe@nvidia.com> | 13 February 2023, 19:05:29 UTC |
977eb92 | Yong He | 13 February 2023, 18:39:12 UTC | Eliminate `continue` to allow unrolling any loops. (#2645) Co-authored-by: Yong He <yhe@nvidia.com> | 13 February 2023, 18:39:12 UTC |
4dbc74a | Yong He | 13 February 2023, 18:38:14 UTC | Add Loop Unrolling Pass. (#2644) Co-authored-by: Yong He <yhe@nvidia.com> | 13 February 2023, 18:38:14 UTC |
57af2c1 | Yong He | 12 February 2023, 06:48:18 UTC | Update README.md on auto diff feature. | 12 February 2023, 06:48:18 UTC |
77706b7 | Yong He | 11 February 2023, 21:30:18 UTC | Update 07-autodiff.md | 11 February 2023, 21:30:18 UTC |
5bab4ea | Yong He | 11 February 2023, 21:20:37 UTC | Update 07-autodiff.md | 11 February 2023, 21:20:37 UTC |
82c7c78 | Yong He | 11 February 2023, 21:19:15 UTC | Update 07-autodiff.md | 11 February 2023, 21:19:15 UTC |
94b2c67 | Ellie Hermaszewska | 11 February 2023, 05:57:35 UTC | Take into account existing initializer list type when performing coercions (#2641) Fixes https://github.com/shader-slang/slang/issues/2189 | 11 February 2023, 05:57:35 UTC |
c7f486c | Ellie Hermaszewska | 11 February 2023, 04:16:56 UTC | Comment call to vkCreateInstance with a potential pitfall (#2642) Although we could in principle write this explanatory message to stderr, that would entangle this call with the layer search above for what is probably a very unlikely possibility on any normal system. | 11 February 2023, 04:16:56 UTC |
aec57d8 | Yong He | 11 February 2023, 02:46:57 UTC | Fix several autodiff bugs. (#2643) | 11 February 2023, 02:46:57 UTC |
6e7b424 | Yong He | 10 February 2023, 17:01:59 UTC | Fix checking of `[BackwardDerivativeOf]` attribute. (#2640) * Fix checking of `[BackwardDerivativeOf]` attribute. * Fix crash in `canInstHaveSideEffectAtAddress`. * Fix. * Revert fix. * Fix. --------- Co-authored-by: Yong He <yhe@nvidia.com> | 10 February 2023, 17:01:59 UTC |
df02f3f | Sai Praveen Bangaru | 09 February 2023, 22:40:20 UTC | Reverse-mode Loop Support (#2635) * Full loop support now working. MaxItersAttr in progress * Lookup table updates? * Fixed the max iters decoration * Minor fixes & remove superfluous code * fixup warnings * Revert "Lookup table updates?" This reverts commit 7d9b0793fb5239f31d1155776e846dcf1892d8d9. * Update 07-autodiff.md * Change maxiters to MaxIters * Added asserts * Update 07-autodiff.md | 09 February 2023, 22:40:20 UTC |
d911e1b | Sai Praveen Bangaru | 09 February 2023, 22:19:55 UTC | Fixed derivatives for kIROp_Neg and kIROp_Div, added another test (#2639) | 09 February 2023, 22:19:55 UTC |
fbe31ad | Ellie Hermaszewska | 09 February 2023, 05:16:30 UTC | Use stable sort in generation of lookup tables (#2638) * Add Slang::List::stableSort * Use stable sort in generation of lookup tables * Disable newline translation when writing lookup tables | 09 February 2023, 05:16:30 UTC |
6bbd673 | Yong He | 08 February 2023, 21:35:43 UTC | Update 07-autodiff.md | 08 February 2023, 21:35:43 UTC |
f2e564c | Yong He | 08 February 2023, 21:20:17 UTC | Replace \cal with \mathbb (#2637) Co-authored-by: Yong He <yhe@nvidia.com> | 08 February 2023, 21:20:17 UTC |
80b1b37 | Yong He | 08 February 2023, 21:04:32 UTC | Update autodiff documentation with more precise math definitions. (#2636) Co-authored-by: Yong He <yhe@nvidia.com> | 08 February 2023, 21:04:32 UTC |
b1d7dc0 | winmad | 08 February 2023, 06:30:00 UTC | Add backward derivatives for functions in diff.meta.slang (#2633) * WIP: start adding backward derivatives * Overhaul `transposeParameterBlock` to support `inout` params. * Small bug fixes. * Bug fix on differentiable intrinsic specialization. * Fixes. * Run autodiff tests on CPU. * Clean up. * Overhaul `transposeParameterBlock` to support `inout` params. * Small bug fixes. * Bug fix on differentiable intrinsic specialization. * Fixes. * Run autodiff tests on CPU. * Clean up. * More bug fixes. * WIP: working on detach * Arithmetic simplifications and more IR clean up logic. * WIP: adding detach and abs * Fix detach and abs * Fix. * Add IR transform pass for cleaner code emit. * Fix test cases. * Fix type system logic for reference type. * Add backward derivatives for functions that already have forward derivatives * Fix changes --------- Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Lifan Wu <lifanw@nvidia.com> | 08 February 2023, 06:30:00 UTC |
4be623c | Yong He | 08 February 2023, 02:36:35 UTC | Arithmetic simplifications and more IR clean up logic. (#2632) | 08 February 2023, 02:36:35 UTC |
101f164 | Yong He | 07 February 2023, 00:11:49 UTC | Update documentation (#2631) Co-authored-by: Yong He <yhe@nvidia.com> | 07 February 2023, 00:11:49 UTC |
f12d422 | Yong He | 06 February 2023, 23:56:48 UTC | Fix documentation (#2630) Co-authored-by: Yong He <yhe@nvidia.com> | 06 February 2023, 23:56:48 UTC |
49d68bf | Yong He | 06 February 2023, 23:45:17 UTC | Update documentation TOC (#2629) Co-authored-by: Yong He <yhe@nvidia.com> | 06 February 2023, 23:45:17 UTC |
fe5cd24 | Yong He | 06 February 2023, 23:37:47 UTC | Improve autodiff documentation (#2628) Co-authored-by: Yong He <yhe@nvidia.com> | 06 February 2023, 23:37:47 UTC |
7ecdf54 | Yong He | 06 February 2023, 22:44:19 UTC | Fixup documentation (#2627) Co-authored-by: Yong He <yhe@nvidia.com> | 06 February 2023, 22:44:19 UTC |
e84c891 | Yong He | 06 February 2023, 22:35:27 UTC | Add documentation for autodiff feature (#2626) Co-authored-by: Yong He <yhe@nvidia.com> | 06 February 2023, 22:35:27 UTC |
5ede9a3 | Yong He | 06 February 2023, 22:34:19 UTC | GFX: make dispatch commands return error code. (#2625) * GFX: make dispatch commands return error code. * Fix cuda. --------- Co-authored-by: Yong He <yhe@nvidia.com> | 06 February 2023, 22:34:19 UTC |
e893a83 | Yong He | 06 February 2023, 18:07:02 UTC | Fix crash when processing nested switch. (#2624) * Fix crash when processing nested switch. * Clean up. --------- Co-authored-by: Yong He <yhe@nvidia.com> | 06 February 2023, 18:07:02 UTC |
a12c551 | Yong He | 05 February 2023, 04:07:14 UTC | Patch transcription of `inout` non differentiable params. (#2623) | 05 February 2023, 04:07:14 UTC |
228e71d | Yong He | 04 February 2023, 00:44:33 UTC | Overhaul `transposeParameterBlock` to support `inout` params. (#2621) * Overhaul `transposeParameterBlock` to support `inout` params. * Small bug fixes. * Bug fix on differentiable intrinsic specialization. * Fixes. * Run autodiff tests on CPU. * Clean up. * More bug fixes. * Add test coverage on inout param. * Fix language server hinting for transcribed mutable params. --------- Co-authored-by: Yong He <yhe@nvidia.com> | 04 February 2023, 00:44:33 UTC |
ee49a62 | jsmall-nvidia | 03 February 2023, 22:11:12 UTC | Small fixes around repro (#2622) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix issues in repo caused by undefined C++ expression evaluation order. | 03 February 2023, 22:11:12 UTC |
1890836 | Ellie Hermaszewska | 03 February 2023, 04:19:10 UTC | Correct indentation in premake lua (#2620) No semantic change | 03 February 2023, 04:19:10 UTC |
a00dc69 | Ellie Hermaszewska | 03 February 2023, 04:18:49 UTC | Use SPIR-V opcode names rather than numbers (#2571) * s/emititng blobal/emitting global * Use SPIR-V opcode names rather than numbers * regenerate Visual Studio project files * Use names for extended SPIR-V GLSL instructions * Add missing operand for SPIR-V extended instruction * Add warning against modifying generated hashing files * Squash warnings on MSVC | 03 February 2023, 04:18:49 UTC |
bbd1e17 | Yong He | 01 February 2023, 22:18:57 UTC | Support `out` parameters in backward differentiation. (#2619) * Support `out` parameters in backward differentiation. * Fixes. * Fix cleanup. --------- Co-authored-by: Yong He <yhe@nvidia.com> | 01 February 2023, 22:18:57 UTC |
c5895fb | Ellie Hermaszewska | 01 February 2023, 15:32:40 UTC | Use gmake2 as a premake target over gmake (#2587) The gmake generator has been deprecated by gmake2 https://premake.github.io/docs/Using-Premake/#using-premake-to-generate-project-files gmake2 has better dependency handling around our custom rules leading to fewer runs of slang-generate etc... | 01 February 2023, 15:32:40 UTC |
e312d5c | Sai Praveen Bangaru | 31 January 2023, 08:26:59 UTC | Patched support for multi-return and fallthrough if-else with break stmts (#2617) | 31 January 2023, 08:26:59 UTC |
77cdbb2 | Yong He | 31 January 2023, 04:03:46 UTC | Add transposition logic for constructor opcodes. (#2618) * Add transposition logic for constructor opcodes. * Fix. * Add language server regression test. --------- Co-authored-by: Yong He <yhe@nvidia.com> | 31 January 2023, 04:03:46 UTC |
499b025 | Yong He | 31 January 2023, 03:24:09 UTC | Make ArrayExpressionType a DeclRefType and define its autodiff extension in stdlib. (#2615) * Allow array parameters in forward diff. * Use type canonicalization instead of coercion. * Reimplement array type. * Fix. * Update test case. --------- Co-authored-by: Yong He <yhe@nvidia.com> | 31 January 2023, 03:24:09 UTC |
134dd7e | Sai Praveen Bangaru | 30 January 2023, 16:46:36 UTC | Overhauled reverse-mode control flow handling (#2608) * Added switch-case support; fixed non-diff parameter transposition * Made region propagation much more robust. Partial loop unzip implementation * WIP: Added most loop handling code, and a test. Still untested * Added CFG Normalization pass + CFG Reversal Pass + Loop Unzipping + most loop transcription * Add single-iter-loop test. * proj files * removed comments * Update reverse-loop.slang * Removed out-of-date code * Disabled IR validation during constructSSA phase of normalizeCFG. constructSSA now reuses sharedBuilder * Moved normalizeCFG() call to prepareFuncForBackwardDiff() | 30 January 2023, 16:46:36 UTC |
4a66e97 | Yong He | 28 January 2023, 00:41:31 UTC | Register allocation during phi elimination. (#2613) * Register allocation during phi elimination. * Enhance the test case. * Cleanup line breaks in test case. * remove unnecessary line break changes. * More cleanups. --------- Co-authored-by: Yong He <yhe@nvidia.com> | 28 January 2023, 00:41:31 UTC |
93a6b61 | skallweitNV | 27 January 2023, 19:53:57 UTC | Add ASAN support + fixes (#2614) * Add ASAN support to premake * Fix StringRepresentation when ASAN is enabled * Fix deep recursion in slang-generate * Fix hello-world example * Fix gpu-printing example * Linux fix * Try fixing linux * Add missing include | 27 January 2023, 19:53:57 UTC |
9f6b6fb | skallweitNV | 27 January 2023, 02:48:12 UTC | Format premake5.lua (#2612) | 27 January 2023, 02:48:12 UTC |
1f4c7ca | Yong He | 26 January 2023, 01:27:40 UTC | Unify UpdateField and UpdateElement with access chain. (#2611) * Unify UpdateField and UpdateElement with access chain. * Fix warnings. Co-authored-by: Yong He <yhe@nvidia.com> | 26 January 2023, 01:27:40 UTC |
aa6814b | Yong He | 25 January 2023, 22:48:01 UTC | Cleanup IR representation of interface member derivative. (#2610) Co-authored-by: Yong He <yhe@nvidia.com> | 25 January 2023, 22:48:01 UTC |
ae11538 | skallweitNV | 25 January 2023, 16:48:55 UTC | GFX report live objects (#2609) * Add utility to call D3D ReportLiveObjects * Add gfxReportLiveObjects API call * Only warn on swapchain image references | 25 January 2023, 16:48:55 UTC |
951ad25 | Yong He | 25 January 2023, 06:16:21 UTC | Reimplement address elimination. (#2605) * Reimplement address elimination pass. * Fix error. * Update test references. Co-authored-by: Yong He <yhe@nvidia.com> | 25 January 2023, 06:16:21 UTC |
a3b0eff | jsmall-nvidia | 24 January 2023, 17:04:14 UTC | Small fix for "static" in doc output (#2606) * #include an absolute path didn't work - because paths were taken to always be relative. * Upgrade to slang-llvm-13.x-33 * Kick - as build failed on download egress. * Output "static" on methods in doc output. | 24 January 2023, 17:04:14 UTC |
46a4d98 | Yong He | 23 January 2023, 14:59:25 UTC | Full address insts elimination for backward autodiff. (#2604) Co-authored-by: Yong He <yhe@nvidia.com> | 23 January 2023, 14:59:25 UTC |
263ca18 | skallweitNV | 20 January 2023, 21:17:14 UTC | Add vulkan extensions to support DLSS (#2603) | 20 January 2023, 21:17:14 UTC |
6fae15c | Yong He | 19 January 2023, 16:58:20 UTC | Add diagnostic for calling non-bwd-diff func from bwd-diff func. (#2602) | 19 January 2023, 16:58:20 UTC |
0586f32 | jsmall-nvidia | 18 January 2023, 19:11:50 UTC | Upgrade slang-llvm-13.x-33 (#2600) * #include an absolute path didn't work - because paths were taken to always be relative. * Upgrade to slang-llvm-13.x-33 * Kick - as build failed on download egress. | 18 January 2023, 19:11:50 UTC |
86ddb9c | Yong He | 18 January 2023, 06:19:10 UTC | First custom backward-derivative test case working. (#2598) | 18 January 2023, 06:19:10 UTC |
a0994a8 | jsmall-nvidia | 18 January 2023, 05:01:58 UTC | Add `set` to spirv_instruction (#2597) | 18 January 2023, 05:01:58 UTC |
1a48681 | Sai Praveen Bangaru | 18 January 2023, 01:21:01 UTC | Added switch-case support; fixed non-diff parameter transposition (#2596) | 18 January 2023, 01:21:01 UTC |
2c43749 | Sai Praveen Bangaru | 15 January 2023, 20:00:20 UTC | Switched to a much simpler method to transpose control flow, nested control flow works now (#2595) | 15 January 2023, 20:00:20 UTC |
1c9b331 | Yong He | 15 January 2023, 06:50:57 UTC | Support custom backward derivative attribute. (#2594) | 15 January 2023, 06:50:57 UTC |
14fab67 | Theresa Foley | 14 January 2023, 23:31:31 UTC | Fixes for crash when inlining at global scope (#2593) * Fixes for crash when inlining at global scope Recent changes to the way inlining is implemented in the Slang compiler have broken certain scenarios involving `static const` declarations. The basic problem is that the initial-value expression for a `static const` gets lowered into IR code at the global scope of a module, and if that code includes `call`s to stdlib operations marked `forceInlineEarly`, then we end up trying to apply inlining to code at module scope. The current inlining operation assumes that all `call`s are in basic blocks, and that the correct way to do inlining involves splitting those blocks. This change adds logic to detect when the callee at a call site to be inlined consists of a single basic block ending in a `return`, and in that case it invokes specialized inlining logic that doesn't split basic blocks and doesn't need to care if the original `call` is in a basic block. Thus we are able to inline calls to single-basic-block `forceInlineEarly` functions called as part of the initialization for global-scope `static const` variables. This logic does *not* solve the problem of calls to multi-block `forceInlineEarly` functions from the global scope. Such calls cannot really be inlined. A secondary problem that arises when inlining such calls is that the callee might include local temporaries (`var` instructions) that are read and written (`load`s and `store`s), and none of those instructions should be allowed at the global scope. In the case of the functions being inlined here, the `load`/`store` operations are superfluous, and should be cleaned up by our SSA pass. The only reason that they seem to *not* be getting cleaned up in the case that has been triggering crashes is that the callee is a generic. The current logic for the SSA pass was skipping the bodies of generic functions, so they would not be cleaned up. This change enables the SSA pass to apply to the bodies of generic functions, and also ensures that SSA cleanups are applied *before* any `forceInlineEarly` functions get inlined. * fixup: liveness test outputs | 14 January 2023, 23:31:31 UTC |
4adc64f | Yong He | 13 January 2023, 19:48:54 UTC | Frontend work for `[BackwardDerivative]` and `[BackwardDerivativeOf]`. (#2589) * Frontend work for `[BackwardDerivative]` and `[BackwardDerivativeOf]`. * Fix clang issue. * Fix. * fix gcc issue * fix formatting. Co-authored-by: Yong He <yhe@nvidia.com> | 13 January 2023, 19:48:54 UTC |
63b874d | jsmall-nvidia | 12 January 2023, 22:11:42 UTC | Fix issue around linking/obfuscation (#2588) * #include an absolute path didn't work - because paths were taken to always be relative. * Work around for some issue seen with a repro. * Small improvement in doing IDifferentiable check. * Fix around obfuscation linkage. | 12 January 2023, 22:11:42 UTC |
a3ac6e7 | Yong He | 11 January 2023, 23:33:28 UTC | Make backward differentiation work with generics. (#2586) * Make backward differentiation work with generics. * Fix. * Another fix. * More fix. Co-authored-by: Yong He <yhe@nvidia.com> | 11 January 2023, 23:33:28 UTC |
2026268 | jsmall-nvidia | 10 January 2023, 22:01:24 UTC | Small fixes around repro loading/autodiff (#2585) * #include an absolute path didn't work - because paths were taken to always be relative. * Work around for some issue seen with a repro. * Small improvement in doing IDifferentiable check. | 10 January 2023, 22:01:24 UTC |
2f42208 | Yong He | 10 January 2023, 20:42:55 UTC | Nested bwd-diff func call context save/restore. (#2584) Co-authored-by: Yong He <yhe@nvidia.com> | 10 January 2023, 20:42:55 UTC |
eb813fb | jsmall-nvidia | 09 January 2023, 15:27:57 UTC | Small fixes to cuda-target.md | 09 January 2023, 15:27:57 UTC |
39f1e4a | jsmall-nvidia | 09 January 2023, 15:26:13 UTC | Fix typo in CUDA target docs | 09 January 2023, 15:26:13 UTC |
b985b1b | jsmall-nvidia | 06 January 2023, 22:20:42 UTC | Fix small issue around emitInterpolationModifiersImpl when layout is nullptr. (#2583) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix output when layout is nullptr in emitInterpolationModifiersImpl | 06 January 2023, 22:20:42 UTC |
33fb959 | Yong He | 06 January 2023, 21:39:06 UTC | Split bwd_diff op into separate ops for primal and propagate func. (#2582) * Split bwd_diff op into separate ops for primal and propagate func. * Fix. * Download swiftshader with github actions instead of curl on linux. * Fix github action. Co-authored-by: Yong He <yhe@nvidia.com> | 06 January 2023, 21:39:06 UTC |
e70cbe7 | Ellie Hermaszewska | 06 January 2023, 19:02:47 UTC | Fix validation errors (and hang) in swapchain resize test (#2578) * Use same format as swapchain for framebuffer in swapchain resize test * Use correct resource state for vertex buffer in swapchain resize test * Call acquireNextImage before drawing to fix validation error in swapchain resize test | 06 January 2023, 19:02:47 UTC |
7f64b2a | Sai Praveen Bangaru | 04 January 2023, 18:10:13 UTC | Multi-block reverse-mode autodiff (#2576) * Initial multi-block implementation * Implemented multi-block reverse-mode (without loops) * Added logic to remove block-level decorations to avoid confusing IR simplification passes * Fixed issues with block-level decorations during IR simplification by removing them prior to simplification. Co-authored-by: Yong He <yonghe@outlook.com> | 04 January 2023, 18:10:13 UTC |
e8f977a | Ellie Hermaszewska | 04 January 2023, 12:02:04 UTC | Avoid dots in auto-detected filename extensions (#2566) Supersedes #2532 | 04 January 2023, 12:02:04 UTC |
57e9786 | Ellie Hermaszewska | 04 January 2023, 12:01:42 UTC | Add format checking attributes on printf-like functions (#2570) * Add format checking attributes on printf-like functions * Don't use printf format attributes on msvc Where they are not supported | 04 January 2023, 12:01:42 UTC |
6dbdb74 | Yong He | 21 December 2022, 23:25:38 UTC | Further unify the autodiff passes. (#2574) * Further unify the autodiff passes. * Fix clang compilation error. * Rename ForwardDerivativeTranscriber->ForwardDiffTranscriber. * Remove unused fields from Transcriber classes. * More small cleanups. * Cleanup. Co-authored-by: Yong He <yhe@nvidia.com> | 21 December 2022, 23:25:38 UTC |
8878429 | Yong He | 19 December 2022, 20:36:39 UTC | Update to checkout@v3 (#2572) Co-authored-by: Yong He <yhe@nvidia.com> | 19 December 2022, 20:36:39 UTC |
216dfba | Yong He | 19 December 2022, 19:47:19 UTC | Separate primal computations from unzipped function into an explicit function. (#2569) Co-authored-by: Yong He <yhe@nvidia.com> | 19 December 2022, 19:47:19 UTC |
36220da | Ellie Hermaszewska | 19 December 2022, 16:20:58 UTC | s/TRACTING/TRACING/ (#2567) Closes #2561 | 19 December 2022, 16:20:58 UTC |
145a0f6 | Ellie Hermaszewska | 19 December 2022, 16:20:24 UTC | Correct user guide's section on preprocessor directives (#2565) | 19 December 2022, 16:20:24 UTC |
1c2c490 | Yong He | 14 December 2022, 17:37:55 UTC | Fix code generation for matrix reshape. (#2568) Co-authored-by: Yong He <yhe@nvidia.com> | 14 December 2022, 17:37:55 UTC |
5ce8d4c | skallweitNV | 14 December 2022, 17:11:01 UTC | Shader cache improvements (#2564) * Make shader cache tests check the output buffer * Add shader cache eviction test * Cleanup comments * Improve TestReporter thread safety * Split lockFile test into two tests * Cleanup PersistentCache tests * Disable multi-threaded tests on aarch64 | 14 December 2022, 17:11:01 UTC |
9d04835 | Sai Praveen Bangaru | 12 December 2022, 22:33:44 UTC | Added support for nested calls (#2562) * Added initial support for nested calls * removed comments Co-authored-by: Yong He <yonghe@outlook.com> | 12 December 2022, 22:33:44 UTC |
c2dc1a8 | skallweitNV | 12 December 2022, 18:25:48 UTC | Refactor shader cache (#2558) * Fix a bug in Path::find * Fix code formatting * Fix LockFile and add LockFileGuard * Add PersistentCache and unit test * Replace file path dependency list with source file dependency list * Add note on ordering in Module/FileDependencyList * Remove old shader cache code * Refactor shader cache implementation * Temporarily skip unit tests reading/writing files * Fix warning * Reenable lock file test * Rename shader cache tests and disable crashing test * Testing * Stop using Path::getCanonical * Fix persistent cache lock and test * Fix threading issues * Move adding file dependency hashes to getEntryPointHash() * Fix handling of #include files * Allow specifying additional search paths for gfx testing device * Work on shader cache tests * Update project files * Revive shader cache graphics tests * Split graphics pipeline test * Fix compilation | 12 December 2022, 18:25:48 UTC |
8d359fc | Yong He | 09 December 2022, 17:09:53 UTC | Add `diffPair` stdlib function. (#2560) | 09 December 2022, 17:09:53 UTC |
41eb19e | Yong He | 08 December 2022, 22:56:20 UTC | Auto-diff for matrix operations. (#2559) Co-authored-by: Yong He <yhe@nvidia.com> | 08 December 2022, 22:56:20 UTC |
468bb7e | Sai Praveen Bangaru | 08 December 2022, 16:50:55 UTC | More type support for reverse-mode (#2551) * Add vector arithmetic test. Make gradient accumulation work for any IRLoad * Added support for general vector types, and split transposition into transpose & materialize to allow emitting the fully accumulated gradient for complex types. * Several bug fixes + finished up support for vector & struct types + removed prop pass * minor fixes (int/uint casts) * Removed IRConstruct * Added some type casts to prevent warnings * minor fix for unused variable | 08 December 2022, 16:50:55 UTC |
53e891e | Yong He | 07 December 2022, 21:42:48 UTC | Rename IR opcodes to unify style. (#2556) Co-authored-by: Yong He <yhe@nvidia.com> | 07 December 2022, 21:42:48 UTC |
7071470 | Yong He | 07 December 2022, 20:52:20 UTC | Remove `construct` IR op. (#2555) Co-authored-by: Yong He <yhe@nvidia.com> | 07 December 2022, 20:52:20 UTC |
3a3a8b5 | Yong He | 07 December 2022, 20:02:30 UTC | Lower-to-ir no longer produces `Construct` inst. (#2553) Co-authored-by: Yong He <yhe@nvidia.com> | 07 December 2022, 20:02:30 UTC |
f116f43 | skallweitNV | 07 December 2022, 16:21:22 UTC | Make slang-test depend on test tool libraries (#2554) | 07 December 2022, 16:21:22 UTC |