https://github.com/shader-slang/slang

Revision Author Date Message Commit Date
1b89f78 Capabilities System, CapabilitySet Logic Overhaul (#4145) * Capabilities System, Backing Logic Overhaul Fixes #4015 Problems to address: 1. Currently the capabilities system spends anywhere from 25-50% of compile time on the CapabilityVisitor. Most of this time is spent on join logic: 1. Finding abstract atoms 2. Comparing list1<->list2. This should and can be made significantly faster. 2. Error system does not produce errors with auxiliary information. This will require a partial redesign to provide more useful semantic information for debugging. What was addressed: 1. Array backed `CapabilityConjunctionSet` was replaced in-favor for a `UIntSet` backed `CapabilityTargetSets`. The design is described below. Design: * `CapabilityTargetSets` is a `Dictionary<targetAtom, CapabilityTargetSet>`. This is not an array for 2 reasons: 1. Easy to figure out which target is missing between two `CapabilityTargetSets` 2. To statically allocate an array requires the preprocessor to manually annotate which Capability is a target and link that Capability to an index. This means a dictionary is required for lookup regardless of implementation. * `CapabilityTargetSet` is an intermediate representation of all capabilities for a singular `target` atom (`glsl`, `hlsl`, `metal`, ...). This structure contains a dictionary to all stage specific capability sets for fast lookup of stage capabilities supported by a `CapabilitySet` for a `target` atom. This reduces number of sets searched. * `CapabilityStageSet` is an intermediate representation of all capabilities for a singular `stage` atom (`vertex`, `fragment`, ...). This structure holds all disjoint capability sets for a `stage`. A disjoint set is rare, but may exist in some scenarios (as an example): `{glsl, EXT_GL_FOO}{glsl, _GLSL_130, _GLSL_150}`. This reduces the number of sets searched. * `UIntSet` is the main reason for the redesign for better performance and memory usage. 
All set operations only require a few operations, making all set logic trivial and with minimal cost to run. All algorithms were modified to focus around `UIntSet` operations. 2. Errors * Semantic information is now better linked to the calling function to provide a connection of function<->function_body when saving semantic information for errors. * Missing targets now print errors much like other error code by finding code which could be a cause of incompatibility. What is missing: 1. Add non-naive support for non-stage-specific capabilities such as `{hlsl, _sm_5_0}`. Currently non-stage-specific targets emulate the behavior by assigning such capabilities to every stage: `{hlsl, _sm_5_0, vertex} {hlsl, _sm_5_0, fragment}...`. Removal of this behavior would remove redundant shader stage sets being made at construction time (~80% of new implementation runtime). This is an addition, not an overhaul. 2. Optionally: `UIntSet` should be modified to support SIMD operations for significantly faster operations. This is not required immediately since `UIntSet` is already not a performance constraint. Notes: * UIntSet had implementation bugs which were fixed in this PR. * The old capabilities system had bugs which were fixed in this PR when transforming to the new implementation. * fix .natvis debug view * Small optimizations I found while working on the addition the AST building pass looks like so now: 1% = ~capabilitySet 2% = capabilitySet() 1.5% capabilitySet::unionWith() 0.8% capabilitySet::join() 1.5% auxiliary info for debugging ~0.5-1% extra visitor overhead ~5% total for the visitor ~6.5% for total runtime costs * fix caps which were wrong but worked * push minor syntax fix (still looking for why other tests fail) * perf & bug fixes 1. did not properly remake isBetterForTarget for this->empty case with that as Invalid. This is best case in this scenario. 2. Remade serializer for stdlib generation. Faster (more direct) & cleaner code. 
NOTE: did not address review comments * fix glsl.meta caps error * fixing findBest logic again & UIntSet wrapper findBest was not checking for 'more specialized' targets & the element counter was flawed * faster getElements algorithm + natvis for UIntSet + wrong warning * type incompatibility of bitscanForward implementations * try to fix warnings again * remove ptr for clang intrinsic * add missing header * ifdef to allow clang compile * compiler hackery to fix up platform/type independent operations * bracket * fix MSVC error * missing template * change types out again * changes to fix compiling * adjustment to parameter for Clang/GCC * added iterator to delay processing all atomSets of a CapabilitySet * add a few missing consts * ensure we never have more than 1 disjointSet Added a wrapper + assert + union functionality to all possible disjoint sets. This was done in favor of a removal of the LinkedList for 2 reasons: 1. We still need 0-1 set functionality. 2. Might as well keep the code, just disallow the problematic functionality. * address review comments non linked-list refactor review comments addressed; add doc comments + remove redundant code * comments + remove isValid for bool operator * push removal of linkedlist for capabilities * add missing break * address review comments minor adjustments of syntax * push a fix to the `CapabilitySet({shader, missing target})` code * quality + error 1. add iterator to UIntSet 2. do not specialize target_switch if profile is derived from case (GLSL_150 is not compatible with GLSL_400) * fix target_switch erroring + temporarily remove UIntSet::Iterator temporarily remove UIntSet::Iterator. 
It will be added after, testing code on CI first so I can multi-task fixing the UIntSet Iterator * fix the UIntSet iterator * Revert "fix the UIntSet iterator" temporarily to pull from master * add metal error as per texture.slang (took a while I realize this was why things were breaking, likely should adjust errors to reflect this) * Rework UIntSet to have a template for output type This is done so it is reasonable to debug the iterator output and not just dealing with messy int's Fix problems with the iterators implemented + invalid capabilities handling * removed incorrect `__target_switch` capability barycentric was being used with anticipation of `profile glsl450`, this does not expand into `GL_EXT_fragment_shader_barycentric`, this instead caused an error which is hidden during cross-compile. * remove some uses of getElements * remove undeclared_stage for now * remove redundant code associated with `undeclared_stage` * remove unused variable * address review specifically to note removed static in a thread dangerous scope. Now using a `const static` for read only (thread safe) which precompile steps generate * move GLSL_150 capdef change to sm_4_1 (more accurate) * address most review comments did not address: https://github.com/shader-slang/slang/pull/4145#discussion_r1602256776 * revert incorrect code review suggestion * push changes for all code review suggestions 16 May 2024, 04:04:12 UTC
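The `UIntSet` design described in the commit above rests on set operations costing a few word-wide instructions. A minimal sketch of that idea follows, assuming one bit per capability atom; `BitSet` and its members are hypothetical stand-ins, not Slang's actual `UIntSet` class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a UIntSet-style bitset (not Slang's class).
struct BitSet {
    std::vector<std::uint64_t> words;

    // Mark atom `index` as present, growing the storage if needed.
    void add(std::size_t index) {
        std::size_t word = index / 64;
        if (word >= words.size()) words.resize(word + 1, 0);
        words[word] |= std::uint64_t(1) << (index % 64);
    }

    bool contains(std::size_t index) const {
        std::size_t word = index / 64;
        return word < words.size() && ((words[word] >> (index % 64)) & 1u);
    }

    // Union costs one OR per 64 atoms.
    void unionWith(const BitSet& other) {
        if (other.words.size() > words.size()) words.resize(other.words.size(), 0);
        for (std::size_t i = 0; i < other.words.size(); ++i) words[i] |= other.words[i];
    }

    // True when every atom in `other` is also in this set.
    bool isSupersetOf(const BitSet& other) const {
        for (std::size_t i = 0; i < other.words.size(); ++i) {
            std::uint64_t mine = i < words.size() ? words[i] : 0;
            if (other.words[i] & ~mine) return false;
        }
        return true;
    }
};
```

Comparing two capability sets then becomes a short loop over machine words rather than a list-against-list search, which is the cost the commit message says dominated the old join logic.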
3b0de8b Add diagnostic to prevent defining unsized variables. (#4168) * Add diagnostic to prevent defining unsized static variables. * Fix tests. * Add more tests. * Fix to allow defining variables of link-time size. * update diagnostic message. * Fix tests. * Simplify code. 16 May 2024, 01:07:36 UTC
cc88530 Support combined textures for Metal target (#4169) 15 May 2024, 03:28:28 UTC
4edc72e Remove use of `G0` and `__target_intrinsic` in stdlib. (#4170) * Remove use of `G0` and `__target_intrinsic` in stdlib. * Fix. * Fix calling intrinsic in global scope. 15 May 2024, 01:01:31 UTC
d76bed6 Implement texture functions for Metal target (#4158) * Impl texture APIs for Metal target This commit is to implement texture functions for Metal target. The following functions are implemented and tested. - GetDimensions() - CalculateLevelOfDetail() - CalculateLevelOfDetailUnclamped() - Sample() - SampleBias() - SampleLevel() - SampleCmp() - SampleCmpLevelZero() - Gather() - SampleGrad() - Load() Metal has limited support for the texture functions compared to HLSL. - LOD is not supported for 1D texture, - Depth textures are limited to 2D, 2DArray, Cube and CubeArray textures. - "Offset" variants are limited to 2D, 2DArray, 2D-Depth, 2DArray-Depth and 3D textures. The functions that cannot be implemented for Metal should properly be handled by the capability system later. * Fix the failing test, multi-file.hlsl I am not sure why this change is needed. * Fix compile errors on macOS 2nd try * Remove a typo character to fix the compile error * Trivial clean up * Remove `as_type` where it was intended as static_cast * Use a simpler syntax for __intrinsic_asm * Trivial clean up * Remove TEST_AFTER_FIXING_CAPABILITY_PROBLEM after fixing normalize * Fix the failing test properly * Fix an incorrect setup of Depth-cube texture --------- Co-authored-by: Yong He <yonghe@outlook.com> 14 May 2024, 22:42:12 UTC
5ceb856 Fix CFG reversal logic for loops (#4162) Handles a corner case where the first block after the condition on the true-side is another condition. This would currently result in an invalid reverse graph, where the reverse version of the true-block is the merge point for two different branching insts (the reverse version of the loop as well as the second condition). This patch simply adds a blank block when constructing the reverse-loop (similar to critical edge breaking) so that each branch inst in the reversed loop has a unique merge block. 14 May 2024, 22:29:09 UTC
291b4cd Slang: Support UTF-8 with Byte Order Markers (#4135) Slang APIs are documented as taking UTF-8 encoded shader source, though it's not explicitly documented whether it is allowed to include a BOM (Byte Order Marker). This change adds support for UTF-8 BOM markers by virtue of disposing of BOM data. As a bonus, UTF-16 input which can cleanly decode to UTF-8 is now also accepted. Throwing out the BOM on input is done by leveraging existing functionality in "determineEncoding()", however a bug exists there for null-terminated single character input, where the null byte caused a heuristic to guess UTF-16, even though the null byte isn't part of the string. The bug in "determineEncoding" is fixed by only guessing when bytes >= 2 and not looking past the end of the buffer. The 'implicit-cast' test was mistakenly relying on the bug to pass, as its expected file was being read as UTF16 and cropped to zero length due to the bug. The expected output of implicit-cast is updated to pass with the bug fix in place. The decoding of UTF-16 to UTF-8 is done through an existing 'decode' method. This change fixes a bug in UTF16-LE 'decode' where it was decoded as if it were Big-Endian. Adds 3 small tests to ensure the compiler doesn't choke on source files in UTF-8 (with BOM), UTF16-LE, or UTF16-BE. Bonus: Fixes a bug in diagnostic reporting where hex values were incorrectly translated to text, leading to incorrect, possibly truncated strings. Fixes #4046 Co-authored-by: Yong He <yonghe@outlook.com> 14 May 2024, 18:05:58 UTC
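A minimal sketch of the BOM handling described in the commit above, assuming the standard BOM byte sequences; `detectBom` and `stripUtf8Bom` are hypothetical helpers, not Slang's `determineEncoding()`. The length checks mirror the stated fix of never looking past the end of the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class Encoding { Utf8, Utf16LE, Utf16BE };

// Detects a leading BOM and reports its size in bytes (0 if none).
// Hypothetical helper; input without a BOM is assumed to be UTF-8.
Encoding detectBom(const std::string& bytes, std::size_t& bomSize) {
    auto at = [&](std::size_t i) { return static_cast<unsigned char>(bytes[i]); };
    bomSize = 0;
    if (bytes.size() >= 3 && at(0) == 0xEF && at(1) == 0xBB && at(2) == 0xBF) {
        bomSize = 3;
        return Encoding::Utf8;
    }
    if (bytes.size() >= 2 && at(0) == 0xFF && at(1) == 0xFE) {
        bomSize = 2;
        return Encoding::Utf16LE;
    }
    if (bytes.size() >= 2 && at(0) == 0xFE && at(1) == 0xFF) {
        bomSize = 2;
        return Encoding::Utf16BE;
    }
    return Encoding::Utf8;
}

// Drops a UTF-8 BOM if present. UTF-16 input is returned unchanged here,
// because decoding it to UTF-8 is outside the scope of this sketch.
std::string stripUtf8Bom(const std::string& bytes) {
    std::size_t bomSize = 0;
    if (detectBom(bytes, bomSize) == Encoding::Utf8) return bytes.substr(bomSize);
    return bytes;
}
```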
9ab24cf Propagate warning settings on `Linkage` to IR passes. (#4156) 14 May 2024, 15:24:07 UTC
487ae03 Add LoadAligned and StoreAligned methods to ByteAddressBuffers (#4066) Fixes #4062 This change enables wide load/stores for byte-address-buffer backed resources, when the data is accessed at an offset that is aligned. **Goals** - Improve performance by issuing wider instructions instead of a sequence of scalar instructions, for load and stores of byte-address buffers. - Reduce code size and improve readability of the generated shaders. - Help naive users as well as ninja programmers, generate optimal code. **Non Goals** - Help with Structured buffers, or other resources. - Target compilation time improvements. **Key changes** Adds 2 new overloads for Load and Store operations on ByteAddress Buffers. 1. Load / Store with an extra alignment parameter ``` resource.Load<T>(offset, alignment); resource.Store<T>(offset, value, alignment); ``` 2. LoadAligned / StoreAligned with no extra parameter, with the same signature as original Load / Store. ``` resource.LoadAligned<T>(offset); resource.StoreAligned<T>(offset, value); ``` - This overload will implicitly identify the alignment value, from the base type T of the elementary unit of the resource. **Supported resources** 1. Vectors This can be up to 4 elements, i.e. float -- float4. 2. Arrays This does not have a limit on number of elements, but on a conservative estimate, we can limit to few hundreds. 3. Structures This is used to group a resource of a single type. ``` struct { float4 x; } ``` **Code updates** - Modified byte-address-ir legalize to handle struct, array and vector kinds of load or store access - Added custom hlsl stdlib functions to implement all the overloads for Load, Store etc. - Added C-like emitter, SPIR-V emitter for handling ByteAddressBuffers. - Added a new core stdlib function intrinsic to wrap around alignOf<T>(). - Added a new peephole optimization entry to identify the equivalent IntLiteral value from the alignOf<T>() inst. 
- Added tests to check explicit, and implicit aligned Load and Store operations. 14 May 2024, 06:57:57 UTC
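The implicit-alignment overload above derives the alignment from the type `T` itself. A host-side sketch of that idea is shown here, using C++ `alignof` as an analogue of `alignOf<T>()`; the `loadAligned` function and its bounds and alignment checks are assumptions for illustration and are not how the Slang stdlib or the legalization pass is written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical host-side analogue of LoadAligned<T>(offset): the required
// alignment comes from T, and the load is a single copy of sizeof(T) bytes
// instead of a sequence of scalar reads.
template <typename T>
bool loadAligned(const std::vector<std::uint8_t>& buffer, std::size_t offset, T& out) {
    // Reject offsets that are misaligned for T or that would read out of bounds.
    if (offset % alignof(T) != 0 || offset + sizeof(T) > buffer.size()) return false;
    std::memcpy(&out, buffer.data() + offset, sizeof(T));
    return true;
}
```

In this sketch a misaligned offset is reported as a failure; in a shader the aligned variants presumably place that responsibility on the caller.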
9f23046 [gfx] specify resource view buffer range in bytes (#4149) * refactor gfx buffer range to use byte range * create buffer view with zero struct stride for ClearUnorderedAccessViewUint/Float * create buffer descriptors on demand * avoid copying gfx.dll --------- Co-authored-by: Yong He <yonghe@outlook.com> 13 May 2024, 22:39:49 UTC
04d3dd5 Update CONTRIBUTION.md Clarify which `slang.sln` file needs to be used for cmake workflow. 13 May 2024, 21:22:21 UTC
e005415 add missing Result to IRayTracingCommandEncoder::bindPipline (#4148) 11 May 2024, 22:11:44 UTC
86a9da1 Fix race-condition and visual artifacts issues (#4152) * Fix race-condition and visual artifacts issues PerformanceProfiler::getProfiler() returns a static object for the profiler implementation, which is not thread-safe, so change it to thread_local. There are still some visual artifacts when using Slang as the shading language. We don't know the root cause yet, but found that it is related to our loop inversion algorithm. So stage this feature for now: turn it into an internal option that is off by default. We will re-enable it after more investigation of this optimization. Filed a new issue, 4151, to track it. * Add '-loop-inversion' to a few tests 11 May 2024, 00:32:09 UTC
1dcd814 More Metal Intrinsics. (#4143) 10 May 2024, 16:41:31 UTC
926009a fix typo (#4144) Co-authored-by: Yong He <yonghe@outlook.com> 10 May 2024, 01:17:38 UTC
b446218 Add stdlib tests for `clamp` derivatives which also checks `max` and `min` derivatives (#4136) * Add stdlib tests for `clamp` derivatives which also checks `max` and `min` derivatives * Extend test 09 May 2024, 14:03:46 UTC
bf088c3 Metal: propagate and specialize address space. (#4137) 09 May 2024, 06:06:46 UTC
526430a Support `getAddress` of a single-element vector swizzle. (#4138) Fixes #4112. 09 May 2024, 06:05:14 UTC
8e86121 Support `[__ref]` attribute to make `this` pass by reference. (#4139) Fixes #4110. 09 May 2024, 03:52:36 UTC
448e21a `slangc` tool experience improvements. (#4140) * `slangc` tool experience improvements. Fixes #4123. Fixes #4127. * Update doc. 09 May 2024, 03:52:09 UTC
756ce3d Fix legalization of `kIROp_GetLegalizedSPIRVGlobalParamAddr`. (#4141) 09 May 2024, 03:23:38 UTC
708345d Fix crash in obfuscation (#4134) Slang crashes during obfuscation because it dereferences a null pointer. Add the missing check. In addition, this situation happens when the user provides an empty Slang shader with the '-obfuscate' option, and we shouldn't do anything in that case. So add an early return in obfuscateModuleLocs if no IR code is actually generated. 08 May 2024, 19:25:11 UTC
6d917a0 Fix NonUniformResourceIndex legalization for SPIRV. (#4133) * Fix NonUniformResourceIndex legalization for SPIRV. * Update gh-4131.slang 08 May 2024, 17:41:52 UTC
7514d0b Add github action to ensure PRs are labeled. (#4130) * Add github action to ensure PRs are labeled. * Update. * Fix. * Fix * Fix * more Fix * more fix. * try. * fix * another try. --------- Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> 08 May 2024, 17:22:40 UTC
93e5b71 [gfx] Cache mutable root shader object in Vulkan (#4119) * fix comment * add caching of mutable root shader objects in vulkan * Fix. --------- Co-authored-by: Yong He <yonghe@outlook.com> 08 May 2024, 17:19:08 UTC
4f2330d capture/replay: interface implementation 1 (#4122) * capture/replay: interface implementation 1 - Add global session, filesystem, and session capture interface classes: GlobalSessionCapture for IGlobalSession FileSystemCapture for ISlangFileSystemExt SessionCapture for ISession - Add environment variables to enable it The 2 variables are SLANG_CAPTURE_LAYER and SLANG_CAPTURE_LOG_LEVEL SLANG_CAPTURE_LAYER: In slang_createGlobalSession(), after the compiling/loading stdlib, we will check the capture environment variable, if it's set to 1, we will create a GlobalSessionCapture object and return to user code. SLANG_CAPTURE_LOG_LEVEL: This is to set the log level, user can choose the loglevel to debug. (We can remove this when the feature is fully implemented). - Update premake file and cmake file to add the capture/replay source folder * Fix Windows build error Fix windows build error by adding the "SLANG_MCALL" keyword. Change to use Slang::ComPtr for those captured object pointers to simplify the resource management. Use __func__ macro to print the function name in the log. 08 May 2024, 16:13:45 UTC
eb39708 Make sure pointer local vars have `AliasedPointer` decoration. (#4132) 08 May 2024, 04:50:41 UTC
997f040 Support Metal math functions (#4118) * Support Metal math functions Closes #4024 Note that the Metal documentation says Metal doesn't support the "double" type: "Metal does not support the double, long long, unsigned long long, and long double data types." According to the Metal documentation, math functions are not defined for integer types. That leaves only two types to test: half and float. As a code clean up, __floatCast is replaced with __realCast. But I had to add a new signature that can convert from integer to float. Some of the GLSL functions are moved to hlsl.meta.slang. For those functions, there are no builtin functions for HLSL but there are for GLSL and Metal. "nextafter(T,T)" is currently not working because it requires Metal version 3.1 and we invoke the Metal compiler with a profile version lower than 3.1. * Changes based on review comments. 07 May 2024, 15:27:27 UTC
1b3a428 Support groupshared variables for Metal. (#4116) 07 May 2024, 02:21:03 UTC
618428a Delete `wrap-global-context` pass. (#4114) * Delete `wrap-global-context` pass. The pass was added for the metal backend without realizing that the existing `explicit-global-context` does 99% of the job. Instead of duplicating the logic in a different pass for metal, we extend same explicit-global-context pass to work for metal. * Fix build. 06 May 2024, 21:53:27 UTC
2220d26 Fix macos release script. (#4106) 04 May 2024, 04:56:59 UTC
7db3986 Fix mistake in WaveMatch intrinsic. (#4105) 04 May 2024, 04:56:39 UTC
59903ef Add host shared library target. (#4098) * Add host shared library target. * Attempt fix. * Fix warnings. * try fix. * Fix test. * Fix. 04 May 2024, 01:02:31 UTC
54153a3 Don't bottleneck Wave intrinsics through `WaveMask*` for spirv. (#4099) * Don't bottleneck Wave intrinsics through `WaveMask*` for spirv. * Fix. 03 May 2024, 22:23:23 UTC
47a917c Fix `Ptr::__subscript` to accept any integer index. (#4100) * Fix `Ptr::__subscript` to accept any integer index. * Fix `Ptr::__subscript` to allow 64bit indices. 03 May 2024, 19:18:47 UTC
13250ff Utilize vector operations over scalar if possible (#4092) * Utilize vector operations over scalar if possible Closes #4085 * Fix for the failing CI [ForceUnroll] is removed because it changed the emitted SPIR-V code a little differently for half-conversion.slang. SPIR-V code style is changed to a more preferred style, from "OpXX $$T result $x" to "result:$$T = OpXX $x" 03 May 2024, 17:06:39 UTC
1863fe1 Support generic constraints that are dependent on another generic param. (#4091) 02 May 2024, 23:48:27 UTC
7ef980f Fix unzipping logic for inout non-diff parameters and adjust tests (#4090) * Fix unzipping logic for inout non-diff parameters and adjust tests + Removed `-g0` from `struct-this-parameter.slang` test. Works correctly with the new unzipping logic. + Removed `-g0` from `was/warped-sampling-1d.slang` test. Works correctly with DX12 & CS_5_1. CS_5_0 appears to run into an FXC compiler bug with detecting infinite loops where there don't appear to be any. * Update slang-ir-autodiff-unzip.h * Update warped-sampling-1d.slang 02 May 2024, 23:46:59 UTC
6b30957 Slang: update pointer related documentation (#4088) Slang does have some support for pointers. Remove an outdated comment stating the contrary, and update the section that describes pointer support to also list some relevant limitations. Fixes #3970 Co-authored-by: Yong He <yonghe@outlook.com> 02 May 2024, 23:01:43 UTC
c763750 Handle case where types can be used as their own `Differential` type. (#4057) * Avoid synthesis for when types can be used as their own differenial + Add test * Add missing files.. * Fix issue with method synthesis for self-differential types + Add a generic test * Fix * Fix issue with out-of-date type resolution cache. Witness tables created during the conformance checking phase not being taken into account during the decl type resolution phase because the epoch is not updated after conformance checking. This leads to certain complex associated-type lookup chains (such as the one in tests/compute/assoctype-nested-lookup) not resolving properly and causing errors. * Delete self-differential-type-synthesis-extension.slang * Quick fix to repopulate stdlib cache for deferred stdlib loading * Update slang-check-decl.cpp 02 May 2024, 23:01:21 UTC
e5d49cf Allow multiple _AttributeTargets for attribute declaration (#4087) Syntax like: [__AttributeUsage(_AttributeTargets.Var)] [__AttributeUsage(_AttributeTargets.Param)] struct DefaultValueAttribute { int iParam; }; is now allowed. For a user-defined attribute, we can specify more than one attribute target on the attribute declaration, so one attribute can be used in more than one situation. 02 May 2024, 20:05:18 UTC
f7d54af Fix fmod behavior targeting GLSL and SPIR-V (#4080) * Fix fmod behavior targeting GLSL and SPIR-V The default implementation of fmod was doing a "modulo" operation when "fmod" in HLSL should do a "remainder" operation. * Fix a mistake in `fmod` GLSL target When using __intrinsic_asm, the "if" logic wasn't emitted. "__intrinsic_asm" had to be called from a new function and `fmod` had to call it. Alternatively, I am using `operator?()` as a workaround. A similar modification is made to `roundEven()`, hoping for better performance. 02 May 2024, 18:56:13 UTC
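The difference between the two operations only shows up for operands of opposite sign. A small sketch, assuming the usual definitions (GLSL `mod` is floor-based, HLSL `fmod` is truncation-based like C's `fmod`); `glslMod` and `hlslFmod` are illustrative names, not Slang stdlib functions.

```cpp
#include <cassert>
#include <cmath>

// Floor-based modulo, as in GLSL mod(): the result takes the sign of y.
double glslMod(double x, double y) { return x - y * std::floor(x / y); }

// Truncation-based remainder, as in HLSL fmod(): the result takes the sign of x.
double hlslFmod(double x, double y) { return x - y * std::trunc(x / y); }
```

For example, with x = -5.5 and y = 2.0 the floor-based form gives 0.5 while the truncation-based form gives -1.5, which matches `std::fmod`. For operands of the same sign the two agree.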
679a457 Implement SPIR-V target for GLSL functions (#4083) Fixes #4051 This commit implements the SPIR-V target for GLSL functions. It also fixes a few problems in the GLSL target implementation. 02 May 2024, 16:59:45 UTC
d53d793 Fix reflection-test issue (#4082) (#4084) The reflection test doesn't print the user attributes that decorate variables, only those on types. Therefore, also print the user attributes of variables. 02 May 2024, 16:22:44 UTC
b490414 Delete out-of-date assert. (#4079) 02 May 2024, 04:38:24 UTC
436b22f Fix/replace target intrinsic to target switch part 2 (#4058) * Fix texture capabilities * Remove more __target_intrinsic and fix capability for texture Fixes #3906 With this commit, following functions will use __target_switch: - abs - asdouble - clamp - min - max - EvaluateAttributeSnapped - frexp - log10 - modf - __glsl_textureXXX For an unknown reason, I couldn't get "min(int,int)" working with __target_switch. It causes a test failure in Falcore unit test. --------- Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> 02 May 2024, 03:26:28 UTC
08de73a Copy default target's optionSet to code-gen target's optionSet (#4073) In the current implementation, some options are only added to a target that is specified by the command line option "-target". If the user specifies the target only through the Slang API, e.g. 'spAddCodeGenTarget', those options are missed. To fix the problem, we copy the default target's options to the code-gen target's option set. The default target will only be useful when there is no target specified in the command line. 02 May 2024, 01:29:39 UTC
9043bc5 Fix compile failures when using debug symbol. (#4069) * Fix compile failures when using debug symbol. * Various fixes. * Fix intrinsic. * Fix test. 02 May 2024, 00:30:55 UTC
0bb826f SPIRV: Fix performance issue when handling large arrays. (#4064) * SPIRV: Fix performance issue when handling large arrays. * Add test for packing. * Fix clang. 01 May 2024, 23:44:22 UTC
4533c82 SPIRV: Fix storage class for unwrapped pointers (#4068) In SPIRV legalization, a struct wrapper is created around push constants but is not itself legalized. Putting the struct type into the work list causes the storage access of the push constant pointer to be PhysicalStorageBuffer as expected, instead of Function scope that was produced without the added struct legalization. Adds a SPIRV test that exercises the fix. Fixes #3946 Co-authored-by: Yong He <yonghe@outlook.com> 01 May 2024, 23:16:38 UTC
853987d Add ParamDecl as the attribute target (#4067) Currently we only allow variable, struct, and function as the target for the user-defined attribute, this change adds the function parameter to the target as well. 01 May 2024, 17:46:21 UTC
ca62ec2 Adds functionality to dump IR to stdout (#4065) Adds a member dump() to IRInst that writes the immediate value or IR inst value to stdout to help with debugging 01 May 2024, 07:06:59 UTC
2abd5bd Avoid classifying methods with `[numthreads]` as entry points for CUDA-related targets (#4063) 01 May 2024, 01:03:21 UTC
52b9123 Added diagnostics & built-in type lowering for `[CUDAKernel]` functions (#4042) * Added diagnostics & built-in type lowering for `[CUDAKernel]` functions This PR adds - Diagnostics for non-void return from a cuda kernel entry point - Diagnostics for using differentiable types in a differentiable cuda kernel entry point - Logic for converting built-in types (float3, float3x3, etc..) to portable struct types and unpacks the parameter back into a built-in type on the CUDA side. This is because built-in types have different implementations in CUDA & CPP targets, which causes signature mis-match when linking. * Fix error codes * Add ability to lower structs and arrays that contain built-in types. + Added tests + Fix issue where the host-side was not marshalling data to lowered types. * Update slang-ir-pytorch-cpp-binding.cpp --------- Co-authored-by: Yong He <yonghe@outlook.com> 30 April 2024, 20:05:33 UTC
70111da Generate vectorized version of byteaddress load/store methods (#4036) Fixes #3533 - Add logic to perform aligned memory operations for loading from and storing to composite resources, like vectors within the ByteAddress legalize pass. - Checks Added a new test for byte address with/without alignment. --------- Co-authored-by: Yong He <yonghe@outlook.com> 30 April 2024, 19:20:16 UTC
95ca2aa Change stdlib to not depend on short-circuit (#4056) Do not use "&&" to implement the intrinsic kIROp_And; instead define an 'and' function in stdlib. It is then up to us to determine whether we want 'short-circuit' behavior in stdlib. 30 April 2024, 18:23:11 UTC
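Routing a logical AND through a function call removes short-circuiting because both arguments are evaluated before the call. A minimal C++ sketch of that effect follows; `andEager` and `Counter` are hypothetical names for illustration, not the stdlib function from the commit.

```cpp
#include <cassert>

// Counts how many operands were actually evaluated.
struct Counter {
    int calls = 0;
    bool touch(bool value) { ++calls; return value; }
};

// Both arguments are evaluated before the call, so neither operand is skipped.
bool andEager(bool a, bool b) { return a && b; }
```

With the built-in `&&`, `c.touch(false) && c.touch(true)` evaluates one operand; `andEager(d.touch(false), d.touch(true))` evaluates both, while producing the same result.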
492f56e Add option -disable-short-circuit (#4054) Add option -disable-short-circuit to disable short circuit for logic operators && and ||. Also, disable the short circuit by default in the stdlib. 30 April 2024, 17:47:10 UTC
f1221b8 Metal: Vertex/Fragment builtin and layouts. (#4044) * Metal: Vertex/Fragment builtin and layouts. * Fix. * Fix test. * Emit user semantic on vertex/fragment attributes. 30 April 2024, 16:57:54 UTC
019d68f Replace __target_intrinsics and __specialize_for_target, part 1 (#4050) * Replace __target_intrinsics and __specialize_for_target Partially resolves #3906 Most but not all __target_intrinsics are replaced with __target_switch. All __specialize_for_target are replaced with __target_switch. This change is mostly processed by a temporary c++ program mechanically. Because the change is already too big, the remaining __target_intrinsics will be replaced later in another commit. * Fix indentations * Add diff.meta.slang * Revert the change in __sizeOf<>(). "$G0" doesn't seem to work. It needs to be addressed later. * Revert more functions that use `$G0` keyword 29 April 2024, 21:14:05 UTC
1a40819 Do not mangle the name of identifiers when __extern_cpp is added (#4052) Do not mangle the name of identifiers decorated by "__extern_cpp". For Slang files that are included by both the library module and the entry point module, Slang could generate two different mangled names for the same function, because a function with a struct parameter gets a mangled name that contains the file name. Therefore, we allow using "__extern_cpp" on such a struct, so that no file name is associated with the mangled name. 29 April 2024, 17:15:11 UTC
30b82ab Add variable pointers to render-test-vk and a related failing test-case (#4041) * add variable pointers to render-test-vk and failing (but ignored with workarounds) test-case 29 April 2024, 02:34:02 UTC
2b87c00 Fix invoke resolution when dealing with overloaded type expressions (#4043) 27 April 2024, 06:27:28 UTC
e91bd3b WIP: Force Inline If RefType (#4005) * Force Inline if reftype Fixes #3997. If we are using a refType, we now ForceInline. remarks: 1. Modifications were made in slang-ir-glsl-legalize to change how we translate GlobalParam proxies into GlobalParam. a. We now handle the scenario where a globalParam is used in multiple disjoint blocks (like 2 different functions). * try to figure out why CI fails but local works try to inline DispatchMesh, works locally, may fail on CI(?) * try another fix * add task tests + don't allow semi-early task-shader inline Task shader uses DispatchMesh which is a very big 'hack' where we check for the function name and modify the callees in very large ways. This function does inline, but it cannot inline early due to future mangling that this operation requires to do. This is reflected with the `[noRefInline]` modifier. It is a modifier so users may stop mandatory inlines with `__ref` parameter. 26 April 2024, 05:27:30 UTC
bc7231b Fix unpackUnorm4x8 and unpackSnorm4x8 (#4033) Fixes #4031 Each component of unpackU/Snorm4x8 had to be masked for 8bits. 25 April 2024, 23:14:08 UTC
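The masking step mentioned above can be sketched as follows, assuming the GLSL convention that the first component comes from the least significant byte. This is a host-side illustration, not the stdlib implementation from the commit.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Unpacks four 8-bit unsigned normalized values from one 32-bit word.
std::array<float, 4> unpackUnorm4x8(std::uint32_t packed) {
    std::array<float, 4> out{};
    for (int i = 0; i < 4; ++i) {
        // Without the 0xFF mask, bits from the higher components would leak in.
        std::uint32_t byte = (packed >> (8 * i)) & 0xFFu;
        out[i] = static_cast<float>(byte) / 255.0f;
    }
    return out;
}
```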
ed06811 Keep const-ness in generic functions (#4028) * Keep const-ness in generic functions Closes #3834 The issue was that "const" variables inside of generic functions became non-const variables. This issue prevented some of the GLSL texture functions from being called inside of generic functions. When `propagateConstExpr()` iterates over the global functions, the generic functions had to be handled a little differently. This commit allows the iteration to happen for the generic functions. * Adding an explanation of the test as a comment 25 April 2024, 16:02:13 UTC
366a947 Support derivative functions in compute & capabilities adjustments (#4014) * Support derivative functions in compute & capabilities adjustments fixes #4000 PR implements derivative functions in compute shaders properly so we have the functionality for SPIR-V & GLSL. Tests reflect fragment and compute paths. PR also adjusts capabilities to correct wrong SPIR-V target capabilities for when using textures. Remarks: 1. __requireComputeDerivative(); is an intrinsic_op and not a modifier since inlining will destroy the modifier. 2. Derivative mode is tied to an entry point decoration `[DerivativeGroupQuad]`/`[DerivativeGroupLinear]` or GLSL syntax `derivative_group_linearNV`. Default is to set the mode to `[DerivativeGroupQuad]` * remove -emit-spirv-directly * fixes 1. fix minor issue fwidth change where I returned the wrong type 2. fix issue where glslang{glsl->spirv} is wrong, so we don't run that test and just run the glsl test & direct spir-v test for intrinsic-texture.slang * adjust as per review and refine code 1. add test to ensure multi-diverging-in-logic entry points work -- 2 functions which may cause computeDerivatives + 1 that uses, 1 that does not. 2. naming 3. use entry point ref graph for c-like-targets 4. reordered some code to util's and removed `static inline` since that was just for ease of coding on my end (should not have been pushed). * Grammar * split up source file + isolate GLSL emit path change. --------- Co-authored-by: Yong He <yonghe@outlook.com> 25 April 2024, 13:18:32 UTC
52dcb5b Updating CONTRIBUTION guide to use CMake (#4017) Related to #3703 Removing the build instructions for Premake and replacing them with instructions for CMake, because we are going to move over to CMake soon. Bumping the required CMake version to 3.25.0. When CMakePresets.json has "version:6", it requires CMake version 3.25 or above. See the URL below for more information, https://cmake.org/cmake/help/latest/release/3.25.html CMakeLists.txt copies the prebuilt binary files from external/slang-binaries/bin/windows-x64 CMakeLists.txt was copying "slang-llvm.dll" to the build/Release/lib directory when it should have been build/Release/bin. This made slang-test ignore all FILECHECK tests. This is fixed. Co-authored-by: Yong He <yonghe@outlook.com> 25 April 2024, 01:35:46 UTC
941961e Prevent pointer validation for zero-size arrays (#4021) 24 April 2024, 23:50:50 UTC
d3ed08e Parameter layout and reflection for Metal bindings. (#4022) 24 April 2024, 23:23:35 UTC
fc4c242 Fix macos CI and clang warnings. (#4019) * Fix macos CI. * Fix. * Fix. * Fix. * Fix clang warnings. * Fix more warnings. 24 April 2024, 22:51:43 UTC
211b2ff Silence compiler warning about missing override keywords (#4018) Adding "override" keywords to member functions wherever they are needed. The compiler warning was visible in the CI build but not in the local Visual Studio build. 24 April 2024, 15:06:50 UTC
97631e9 Avoid DXC warnings for missing bitwise op parentheses (#4004) Resolves #3980 Based on operator precedence, Slang may omit parentheses when they are not needed. DXC prints warnings in such cases, and some applications may treat the warnings as errors. This commit emits the parentheses to avoid the DXC warning even when they are not needed. 24 April 2024, 14:18:21 UTC
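As a rough illustration of the idea in 97631e9, the following C++ sketch prints a tiny expression tree twice: once with only the parentheses that precedence requires, and once with extra parentheses around sub-expressions that mix with bitwise operators. `Expr`, `leaf`, `binary`, `precedence`, `isBitwise`, and `emit` are hypothetical names for this sketch and are not Slang's emitter API; the exact set of cases DXC warns about is also not modeled here.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical expression node for this sketch (not a Slang type).
struct Expr {
    std::string text;           // leaf text, or the operator of a binary node
    std::shared_ptr<Expr> lhs;  // null for leaves
    std::shared_ptr<Expr> rhs;  // null for leaves
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr leaf(const std::string& name) {
    return std::make_shared<Expr>(Expr{name, nullptr, nullptr});
}

ExprPtr binary(const std::string& op, ExprPtr l, ExprPtr r) {
    return std::make_shared<Expr>(Expr{op, std::move(l), std::move(r)});
}

// C-like precedence for the few operators used here; higher binds tighter.
int precedence(const std::string& op) {
    if (op == "+") return 4;
    if (op == "&") return 3;
    if (op == "^") return 2;
    if (op == "|") return 1;
    return 0;
}

bool isBitwise(const std::string& op) {
    return op == "&" || op == "^" || op == "|";
}

// Emits `e` as text. `parentOp` is empty at the root.
// With extraParens == false, only parentheses required by precedence appear.
// With extraParens == true, a binary child whose operator differs from its
// parent is also wrapped whenever either operator is bitwise.
// Simplification: only associative operators are modeled, so operands of
// equal precedence never need parentheses.
std::string emit(const ExprPtr& e, const std::string& parentOp, bool extraParens) {
    if (!e->lhs) return e->text;
    std::string s = emit(e->lhs, e->text, extraParens) + " " + e->text + " " +
                    emit(e->rhs, e->text, extraParens);
    bool isChild = !parentOp.empty();
    bool required = isChild && precedence(e->text) < precedence(parentOp);
    bool defensive = extraParens && isChild && e->text != parentOp &&
                     (isBitwise(e->text) || isBitwise(parentOp));
    return (required || defensive) ? "(" + s + ")" : s;
}
```

For the tree `binary("|", binary("&", leaf("a"), leaf("b")), leaf("c"))`, the minimal mode yields `a & b | c` and the defensive mode yields `(a & b) | c`; both mean the same thing, but only the second spells out the grouping.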
c6b9a91 Do not diagnose an error when a symbol is defined as 'extern' and 'export' (#4010) Fix the issue (#3999). When a function is defined as extern and export at the same time, don't report an error; we can use the 'export' function to overload the 'extern' function. 24 April 2024, 01:39:15 UTC
f1de181 Switch to direct-to-spirv backend as default. (#4002) * Switch to direct-to-spirv backend as default. * Fix slang-test. * Fix. * Fix. 23 April 2024, 19:14:21 UTC
0d92068 Fix a bug in the forward derivative of cross product (#4006) * Fix a bug in fwd-diff for cross product * Also add a test for the reverse-mode AD --------- Co-authored-by: Yong He <yonghe@outlook.com> 23 April 2024, 17:19:30 UTC
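The message for 0d92068 does not say what the bug was. For reference only, the forward-mode (tangent) rule for a cross product follows from the product rule: d(a × b) = da × b + a × db. The C++ sketch below writes that rule out; `Vec3`, `add`, `cross`, and `fwdCross` are names made up for this sketch and are not Slang's stdlib or autodiff code.

```cpp
#include <cassert>

// Minimal 3-vector for this sketch (hypothetical, not a Slang type).
struct Vec3 {
    double x, y, z;
};

Vec3 add(Vec3 a, Vec3 b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z};
}

Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

// Tangent of cross(a, b) given input tangents da and db.
// Product rule: d(a x b) = da x b + a x db.
// Operand order matters because the cross product is anti-commutative.
Vec3 fwdCross(Vec3 a, Vec3 da, Vec3 b, Vec3 db) {
    return add(cross(da, b), cross(a, db));
}
```

With a = (1, 0, 0), da = (0, 0, 1), b = (0, 1, 0), db = (1, 0, 0), the first term is (0, 0, 1) × (0, 1, 0) = (-1, 0, 0) and the second term is zero, so the tangent is (-1, 0, 0).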
9f892c9 use memberExpr instead of varExpr (#4008) 23 April 2024, 16:53:32 UTC
484c1e6 ForceInline ByteAddressBuffer operations in stdlib (#4003) * ForceInline ByteAddressBuffer operations in stdlib * fixup 23 April 2024, 02:14:35 UTC
22fbca5 create empty vulkan framebuffer with max dimensions (#3996) Co-authored-by: Yong He <yonghe@outlook.com> 22 April 2024, 16:19:37 UTC
923ef7a bit_cast & reinterpret warning if src->dst type not equally sized. (#3988) * bit_cast & reinterpret warning if src->dst type not equally sized. bit_cast & reinterpret warning if src->dst type not equally sized. --------- Co-authored-by: Yong He <yonghe@outlook.com> 22 April 2024, 14:07:06 UTC
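To illustrate the size check that 923ef7a describes, here is a C++ sketch of a bit cast that refuses mismatched sizes. Slang reports a warning in this situation, while this sketch uses a `static_assert`, so it is stricter than the behavior the commit describes. `isEquallySized` and `bitCast` are hypothetical helper names, and the usage note assumes `float` is a 32-bit IEEE 754 type.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// True when source and destination occupy the same number of bytes.
template <typename Dst, typename Src>
constexpr bool isEquallySized() {
    return sizeof(Dst) == sizeof(Src);
}

// Reinterprets the bytes of `src` as a Dst.
// Unlike the warning described in the commit, a size mismatch here is a
// compile-time error (an assumption of this sketch, chosen for brevity).
template <typename Dst, typename Src>
Dst bitCast(const Src& src) {
    static_assert(sizeof(Dst) == sizeof(Src),
                  "bitCast: source and destination types differ in size");
    static_assert(std::is_trivially_copyable<Dst>::value &&
                      std::is_trivially_copyable<Src>::value,
                  "bitCast: both types must be trivially copyable");
    Dst dst;
    std::memcpy(&dst, &src, sizeof(Dst));
    return dst;
}
```

Under the IEEE 754 assumption, `bitCast<std::uint32_t>(1.0f)` gives `0x3F800000`, while `bitCast<std::uint16_t>(1.0f)` does not compile because the sizes differ.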
c5b855d Update the dependency file (#3994) Update the dependency file to use the latest release version for slang-llvm and slang-glslang where we added a new Linux release to support the older version of Glibc-2.27. Fix a type in github.sh for the glibc compatible option input for premake5.lua. 22 April 2024, 04:27:19 UTC
51dc26e Flag to prevent packing of cbuffer elements in HLSL backend. (#3993) 21 April 2024, 23:51:18 UTC
8362c2d Create a new release build for linux_x64. (#3989) 20 April 2024, 04:27:10 UTC
beae3a9 Add metal downstream compiler + metallib target. (#3990) * Add metal downstream compiler + metallib target. * Add more comments. * Add missing override. 20 April 2024, 04:02:32 UTC
f9bcad3 Initial pass to add capability declarations to stdlib intrinsics. (#3912) 20 April 2024, 03:18:40 UTC
2da28c5 Support arithmetic on generic arguments (#3968) Resolves issue #3935. Slang had to fold the generic arguments after specialization. 19 April 2024, 23:43:21 UTC
e0aa53f allow preludes in include folder (#3976) Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com> 19 April 2024, 21:27:48 UTC
adbaf8f add `-ignore-capabilities` flag (#3984) `-ignore-capabilities` flag allows ignoring capability incompatibilities/discontinuity errors/warnings. We still process capabilities (needed for stdlib). Added to capability tests to ensure everything is working as intended. More will be added in the full stdlib capabilities implementation. 19 April 2024, 20:39:05 UTC
7c162eb Enable NonUniformResourceIndex support for glsl, hlsl and spirv (#3899) Fixes #3876 * ForceInline SampleLevel to allow decorations to apply * explicitly add all the SPIRVAsmOperand Insts in non-differentiable list, which might get inadvertently processed when these functions are inlined into the main shader * Support NonUniformResourceIndex for SPIR-V target Fixes #3876 * add a new IR instruction for NonUniformResourceIndex * slang ir emitter for nonuniform resource index * update the hlsl meta slang * Add test cases for NonUniformResourceIndex access for buffers and textures, with/without cast, nested access etc. * add default c-like emitter for nonuniformresourceinfo * added hlsl emitter * added glsl emitter * requisites for spirv enabling - new decorator for nonuniformresourceindex - emitter for nonuniformresourceindex signature change * add hasResourceType checker * add rwStructBuffType in resourcetype checker * add a case for nonuniformres in emitDecorations * DO NOT COMMIT: This change adds special handling for RWStructBuf within the isResourceType function, if it is a pointer to this resource, return true to make it work with nonuniformres test * spirv emitter for decorations - update the emitLocalInst to perform decorations at the end * added main spirv emitter code * slang emit spirv bugfix * hacky way of supporting Call Inst * move code to cleanup nonuniform inst into helper function * remove stale code from test * add spirv decoration for nonuniform * update test to remove global variables * update coherent-2 test * update comment for special handling * update the spirv legalize to handle nested nonuniforms improved logic that handles call ops, rwstructbuf, nested nonuniforms etc. 
* update nonuniform-array-of-tex test * missed removing nonuniform inst causing duplicate decorations * add glsl and hlsl variants of nonuniform tests * repurpose the hasResource function into something specific for nonuniform inst decoration helper * clean up comments and code around spirv-legalization to emit nonuniform inst by recursively looking into the inst * use the helper canDecorateNonUniformInst to convert `nonUniformResourceInfo` inst to decoration * converted compute/unbounded-array-of-array cross compile test into a simple check test * update contains Resource helper function to be more generic * clean up the case for opcall handling with nonuniform resource inst * update ptr to struct buffer check to be more explicit and rename the function to check for ptr to resource type * update comments and fix the test for coherent * fix typos * update logic on spirv legalize to delete dead instructions - for some reason this doesn't automatically happen * add comments to declarations * add NonuniformResourceIndex to the non-differential inst list 19 April 2024, 16:12:56 UTC
a3a5e7e Metal: rewrite global variables as explicit context. (#3981) * Metal: rewrite global variables as explicit context. * Small tweaks. 19 April 2024, 06:01:45 UTC
a2b9e37 ForceInline SampleLevel to allow decorations to be applied (#3977) Fixes #3969 NonUniformResourceInfo instruction is applied as a Decoration on the backing resource. With the following shader, this is applied to the Function Call. res.rgb *= g_bindless_Texture2D [NonUniformResourceIndex (val.x)].SampleLevel(g_Sampler, v, 0.0).rgb; as shown below: {145371} let %1826 : Int = nonUniformResourceIndex(%1789) {177146} let %1828 : Vec(Float, 4 : Int) = call %SampleLevel(%1826, %sampler, %1827, 0 : Float) This patch ForceInlines SampleLevel intrinsic function call so that the Decoration is correctly applied on the resource. 18 April 2024, 21:12:04 UTC
d3fd747 Implement if(let ...) syntax (#3673) (#3958) 18 April 2024, 06:23:15 UTC
5dd27a2 Support combined texture sampler when targeting HLSL. (#3963) * Support combined texture sampler when targeting HLSL. * Fix glsl intrinsics. * Update source/slang/slang-ir-lower-combined-texture-sampler.cpp Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> * Update source/slang/slang-ir-lower-combined-texture-sampler.cpp Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> * Update source/slang/slang-ir-lower-combined-texture-sampler.cpp Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> * Fix., * Enhance test. * Remove unused field. * Fix indentation --------- Co-authored-by: ArielG-NV <159081215+ArielG-NV@users.noreply.github.com> 18 April 2024, 05:14:34 UTC
355a8d8 commit to partially fix #3931 (#3972) 18 April 2024, 04:41:00 UTC
2c66cc7 Add skeleton for metal backend. (#3971) 18 April 2024, 04:32:28 UTC
4b3f554 Force Inline all the InterlockedAdd functions in stdlib (#3965) This change forcibly inlines the InterlockedAdd functions when using byteAddress buffer. The IR generated when using nonUniformResourceInst on RWByteAddressBuffer: buffer[NonUniformResourceIndex(uint(0))].InterlockedAdd(0, 1); follows the sequence of a call into an index lookup that is wrapped by a nonuniformResourceIndex: %ld = nonUniformResourceIndex(0) Call RWStructBufferInterlockedAdd(%ld, 0, 1) This prevents NonUniformResource decoration of the buffer because it is wrapped by the function call to InterlockedAdd, that further expands to: %gep = getElement(%buffer, 0) SpirvAsmInst(..., rwStructuredBufferGEP(%gep, 0), ...) By Force-Inlining the atomic functions, the buffer / resource is made visible to the nonUniformResourceIndex inst, allowing the decoration. Identified while debugging tests/spirv/coherent-2.slang 17 April 2024, 06:59:41 UTC
6731358 Fix Slang documentation typos (#3961) 17 April 2024, 02:02:45 UTC
282da4a Fix for unscoped enums circular reference causing an error, #3959 (#3962) 17 April 2024, 02:01:06 UTC
d5d39dd Init expressions for struct fields support, #3738 (#3907) * Init expressions for struct members Following commit handles init expressions of structs. The general implementation follows C++ init expression rules for classes & inherited classes. The logic was implemented after type resolution (`SemanticsDeclAttributesVisitor`): 1. Create a default constructor if missing. 2. Check all member variables (`this` and `super`) for if a member has an init expression, continue to *3* if found. 3. For each constructor, insert a member variable's init expression at the beginning of a constructor. This is to follow how C++ does construction of objects. Some important notes about implementation: * We must handle the scenario that there is inheritance. To handle the inheritance information processing `findLevelsOfInheritance` was created. * If a user manually sets overload ranks of constructor expressions we have no way to assume new default constructor overload ranks. * address feedback - moved all scope bound variables into if statement initializers - added indent - changed logic for overloadRank to be centered around positive numbers rather than negative * Inheritance fixes universally & for struct field init 1. reimplemented struct field logic 2. implemented inheritance through calling a "super->init()" inside a constructor for each "this". 3. implemented support for multi level inheritance (4+) and accessing members without a crash. * add a way to ignore Forward declared constructors. * a test and fix for a falcor failure the following case was not handled: creating a default Ctor due to a non L-Value struct field. Having an empty Ctor causes a warning. 
* remove texture/sampler from test since it will break glsl * get inheritance info using existing lookup logic modified Facet lookups to store relative depth rather than arbitrary ::Self or ::Direct for inheritance (which was 'wrong' since depth 2 is not Direct, but was considered a Direct inheritance) * cleanup unused * cleanup unused functions and whitespace * fix compile warning * clean up, reorder, addressed language server fail changed logic to safeguard bad code --> no longer breaks language server if code is incomplete. remove the "semi-ordering" logic because it caused a crash (and this code does nothing functionally, just thought it would be nice to add if '0 cost'). Remove rank setting for constructors, in place use an addition to the overload system: "this" expressions have calling priority over "super" expressions. * undo all inheritance depth checks & code added to the inheritance checking algorithm Reorder default ctor creation and auto-generation of constructor body. * Handle same struct types during overload resolution Changed overload resolution logic to properly handle same struct types; added test to check for multi-param same type function overload. * remove unused ast object Used unused object in an incorrect way. This caused the compiler to not flag a warning. * extension support for default constructors specialization is not supported with default constructors yet. * fix bugs Fix bug in override/overload logic with type comparisons. used wrong type for ctor list construction Specialization has not been added yet * disallow default ctor inside extension * adjust comment, add new tests * add explicit types to invoke, use faster default ctor lookup. * adjust syntax & naming as recommended 16 April 2024, 17:06:23 UTC
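The message for d5d39dd says the feature follows C++ rules for member init expressions. As a reminder of what those C++ rules look like (this is plain C++, not Slang syntax, and `Base` and `Derived` are made-up types), the sketch below shows that base members are initialized first, then the member init expressions in declaration order, and only then the constructor body.

```cpp
#include <cassert>

struct Base {
    int x = 1;  // init expression; applied by the implicit default constructor
};

struct Derived : Base {
    int y = x + 1;  // evaluated after Base is constructed, so it can read x
    int z;          // no init expression; each constructor must set it

    // The body runs after y's init expression, so y is already 2 here.
    Derived() { z = y * 10; }

    // Every constructor gets the same member init expressions applied first.
    explicit Derived(int zValue) : z(zValue) {}
};
```

Constructing `Derived d;` gives `x == 1`, `y == 2`, `z == 20`, and `Derived e(7);` gives `y == 2`, `z == 7`, which matches the commit's description of inserting each init expression at the beginning of every constructor.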
3192f34 [GFX] Fix d3d12 buffer view creation logic for StructuredBuffers. (#3954) 16 April 2024, 06:28:28 UTC
030d7f4 Support 64bit HLSL atomic functions (#3957) Resolves #3951 This adds a few atomic functions for SM6.6. The spec can be found from here: https://microsoft.github.io/DirectX-Specs/d3d/HLSL_SM_6_6_Int64_and_Float_Atomics.html The new functions are: void InterlockedAdd(inout XXX dest, in int64_t value, out int64_t original_value); void InterlockedAdd(inout XXX dest, in uint64_t value, out uint64_t original_value); void InterlockedAnd(inout XXX dest, in uint64_t value, out uint64_t original_value); void InterlockedOr(inout XXX dest, in uint64_t value, out uint64_t original_value); void InterlockedXor(inout XXX dest, in uint64_t value, out uint64_t original_value); void InterlockedMin(inout XXX dest, in int64_t value, out int64_t original_value); void InterlockedMin(inout XXX dest, in uint64_t value, out uint64_t original_value); void InterlockedMax(inout XXX dest, in int64_t value, out int64_t original_value); void InterlockedMax(inout XXX dest, in uint64_t value, out uint64_t original_value); void InterlockedExchange(inout XXX dest, in float value, out float original_value); void InterlockedExchange(inout XXX dest, in int64_t value, out int64_t original_value); void InterlockedExchange(inout XXX dest, in uint64_t value, out uint64_t original_value); void InterlockedCompareStore(inout XXX dest, in int64_t compare_value, in int64_t value); void InterlockedCompareStore(inout XXX dest, in uint64_t compare_value, in uint64_t value); void InterlockedCompareStoreFloatBitwise(inout XXX dest, in float compare_value, in float value); void InterlockedCompareExchange(inout XXX dest, in int64_t compare_value, in int64_t value, out int64_t original_value); void InterlockedCompareExchange(inout XXX dest, in uint64_t compare_value, in uint64_t value, out uint64_t original_value); void InterlockedCompareExchangeFloatBitwise(inout XXX dest, in float compare_value, in float value, out float original_value); void RWByteAddressBuffer::InterlockedAnd64(in uint dest_offset, in 
uint64_t value, out uint64_t original_value); void RWByteAddressBuffer::InterlockedOr64(in uint dest_offset, in uint64_t value, out uint64_t original_value); void RWByteAddressBuffer::InterlockedXor64(in uint dest_offset, in uint64_t value, out uint64_t original_value); void RWByteAddressBuffer::InterlockedMin64(in uint dest_offset, in int64_t value, out int64_t original_value); void RWByteAddressBuffer::InterlockedMin64(in uint dest_offset, in uint64_t value, out uint64_t original_value); void RWByteAddressBuffer::InterlockedMax64(in uint dest_offset, in int64_t value, out int64_t original_value); void RWByteAddressBuffer::InterlockedMax64(in uint dest_offset, in uint64_t value, out uint64_t original_value); void RWByteAddressBuffer::InterlockedExchangeFloat(in uint dest_offset, in float value, out float original_value); void RWByteAddressBuffer::InterlockedExchange64(in uint dest_offset, in int64_t value, out int64_t original_value); void RWByteAddressBuffer::InterlockedExchange64(in uint dest_offset, in uint64_t value, out uint64_t original_value); void RWByteAddressBuffer::InterlockedCompareStore64(in uint dest_offset, in int64_t compare_value, in int64_t value); void RWByteAddressBuffer::InterlockedCompareStore64(in uint dest_offset, in uint64_t compare_value, in uint64_t value); void RWByteAddressBuffer::InterlockedCompareStoreFloatBitwise(in uint dest_offset, in float compare_value, in float value); void RWByteAddressBuffer::InterlockedCompareExchangeFloatBitwise(in uint dest_offset, in float compare_value, in float value, out float original_value); 16 April 2024, 02:47:23 UTC