Revision history - None - origin: https://github.com/shader-slang/slang

visit type:

Revision	Author	Date	Message	Commit Date
10386fa	Yong He	03 September 2021, 08:39:17 UTC	Fix memory error.	03 September 2021, 08:39:17 UTC
265f403	Yong He	03 September 2021, 07:55:25 UTC	Fix crash: dynamic dispatch of generic interface method.	03 September 2021, 07:55:25 UTC
cf7ddda	Theresa Foley	02 September 2021, 23:10:53 UTC	Two small fixes. (#1928) * Fix mangling logic for the case where a symbol name contains characters that aren't permitted (this usually occurs when a module name consists of the actual path to the module). There were multiple early-out `if` cases that accidentally fell through to the fallback path, so that symbol names would end up being excessively long. * Fix type conversion cost lookup cache, by allowing single-element vectors (e.g., `vector<float,1>`) and single row/column matrix types to be distinguished from types of lower rank. Previously, `float` and `float1` and `float1x1` would share a single cache entry, even though each (currently) has very different conversion rules.	02 September 2021, 23:10:53 UTC
caf4642	Yong He	01 September 2021, 23:00:06 UTC	Bug fix for createDynamicObject (#1927) Co-authored-by: Yong He <yhe@nvidia.com>	01 September 2021, 23:00:06 UTC
09e32c1	jsmall-nvidia	01 September 2021, 22:38:43 UTC	Improving glslang -O1 passes (#1926) * #include an absolute path didn't work - because paths were taken to always be relative. * Improve passes used for -O1 option with glslang. * Set sourceManager when serializing container.	01 September 2021, 22:38:43 UTC
b2ad8e9	Yong He	26 August 2021, 17:30:35 UTC	Add API to control interface specialization. (#1925)	26 August 2021, 17:30:35 UTC
33f7e15	Yong He	25 August 2021, 17:27:22 UTC	Add `createDynamicObject` stdlib function. (#1923) This function takes a user provided `typeID` and arbitrary typed value, and turns them into an existential value whose `witnessTableID` is `typeID` and whose `anyValue` is the user provided value. This allows the users to pack the runtime type id info in arbitrary way.	25 August 2021, 17:27:22 UTC
3b0b920	Nathan V. Morrical	23 August 2021, 17:31:00 UTC	Bug fix for optix SBT access (#1922) * optix SBT record data can now be accessed using uniform parameters on ray tracing entry points * Update slang-emit.cpp * fixing a bug where SBT instruction was missing a location at which to insert. Switching back to emitFieldExtract and accounting for changes in instruction emission location	23 August 2021, 17:31:00 UTC
858c7c5	Yong He	17 August 2021, 16:39:02 UTC	Add GLSL450 intrinsics to SPIRV direct emit. (#1921) * Add GLSL450 intrinsics to SPIRV direct emit. * Fix. * Fix compiler error. * Fix. * Fix compiler error. * Make direct-spirv tests actually run.	17 August 2021, 16:39:02 UTC
6406523	Yong He	12 August 2021, 20:14:15 UTC	Further implementation of SPIRV direct emit. (#1920) * Further implementation of SPIRV direct emit. This change implements: - Struct, Vector, Matrix and Unsized Array types. - Basic arithmetic opcodes, vector construct, swizzle etc. - getElementPtr, getElement, fieldAddress, extractField. - SPIRV target intrinsics with SPIRV asm code in stdlib. - RWStructuredBuffer and StructuredBuffer. - Pointer storage class propagation. - Control flow. * Fix.	12 August 2021, 20:14:15 UTC
389d21d	Theresa Foley	11 August 2021, 19:24:43 UTC	Handle unexpected character in mangled names (#1919) This change is related to a case where a user saw a `/` character printed as part of a structure field name in output HLSL (which obviously broke downstream compilation). The root cause in that case was that a module that was `import`ed was found via a file system directory path, and thus the module name that the Slang compiler formed included a `/`. Due to other factors the structure field was emitted based on its mangled name (missing name hint), and that meant the bare `/` escaped into the output. That situation implies some other things maybe are questionable: * Could we have avoided a `/` ending up in the module name to begin with? Maybe, but we kind of want to support non-ASCII characters in names in the long run. * Could we have fixed whatever was causing the name hint to be dropped? Maybe, but we don't ever want name hints to be required for valid compilation (hence why they are "hints"). Even if we investigate these other issues, it is important that the name mangling scheme only ever produce output that uses a constrained subset of ASCII that we expect most compilation targets can support (GLSL being an annoying outlier because of its rules around `_`). This change adds a path where when mangling a `Name` as part of a mangled symbol name we check for the presence of bytes that aren't allowed and then produce an escaped form instead if needed. Note that the code is still byte-oriented for now aothough in the long-run it might want to be oriented around Unicode code points or extended grapheme clusters.	11 August 2021, 19:24:43 UTC
4323a15	Theresa Foley	11 August 2021, 18:00:16 UTC	Fix a few issues around opaque types as outputs (#1918) * Fix a few issues around opaque types as outputs Slang and HLSL support opaque types (textures, buffers, samplers, etc.) as members of `struct`s, mutable local variables, function results, and `out`/`inout` parameters. GLSL and SPIR-V do not. In order to translate Slang code over to GLSL/SPIR-V we use a variety of passes that seek to eliminate all of the above use cases and produce code that only uses opaque types in the limited ways that GLSL/SPIR-V allow. This change relates to the passes that deal with function results and `out`/`inout` parameters. There are two basic changes here: 1. The `specializeResourceOutputs` pass was only dealing with resource (texture/buffer) types. This change updates it to process sampler types as well. 2. The sequencing of the passes made it possible that an opaque-typed local variable might be left around after `specializeResourceOutputs`, which would mean the code is still invalid for GLSL/SPIR-V. This change adds an additional SSA-formation pass which would eliminate any opaque-type local variables whose lifetimes were made simple enough by the optimizations. Together these changes fix a problem-case user shader that was failing to compile for Vulkan. * Update slang-emit.cpp Fix typo 'reuslt' * Update slang-emit.cpp Comment change to re-trigger CI build. Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>	11 August 2021, 18:00:16 UTC
08e36dd	Nathan V. Morrical	10 August 2021, 19:25:25 UTC	Enable reading OptiX SBT records via uniform parameters on ray tracing entry points (#1917) * optix SBT record data can now be accessed using uniform parameters on ray tracing entry points * Update slang-emit.cpp	10 August 2021, 19:25:25 UTC
ebf0e52	jsmall-nvidia	29 July 2021, 14:39:46 UTC	Improvements around glslang optimization (#1916) * #include an absolute path didn't work - because paths were taken to always be relative. * Improvements to how glslang optimizations are handled. Increasing the amount of ids to allow for optimizing more complex shaders, and compacting to try and keep ids within range for usage. * Add other optimizations from current glslang. * Improved -O1/default GLSLANG passes improves output SPIR-V size (Can be more that 1/3 of the size), for a modest increase in compilation time. * Improve comments.	29 July 2021, 14:39:46 UTC
c6f6ce1	Yong He	28 July 2021, 19:24:12 UTC	Experimental DXR1.0 support in gfx. (#1915) * Experimental DXR1.0 support in gfx. - Add `dispatchRays` command. - Add `createRayTracingPipelineState` method to construct a D3D ray tracing state object from a linked slang program and user specified shader table. Limitations/simplifications: no local root signature support, shader table entries contains only shader identifiers and is specified at pipeline creation time, owned by the pipeline state object. * Root object binding for raytracing pipelines. * `maybeSpecializePipeline` implementation for raytracing pipelines. * Add ray-tracing-pipeline example. * Fixes. * Update README.md * Update comments on the lifespan of specialized pipelines Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>	28 July 2021, 19:24:12 UTC
23d406f	Theresa Foley	21 July 2021, 19:52:08 UTC	Work to mitigate SPIR-V bloat (#1914) * Work to mitigate SPIR-V bloat SPIR-V is not an especially compact format, but some patterns in how Slang generates code and then runs it through `spirv-opt` lead to many redundant field-by-field copy operations being emitted. This change attempts to address some of the resulting bloat from the Slang side of things. Note: experimentation shows that the bloat is less pronounced when running either no SPIR-V optimizations or full SPIR-V optimizations, so it is also likely that the bloat should be addressed by changing which `spirv-opt` passes the Slang compiler runs in default (`-O1`) builds. Such changes should come as a distinct pull request. This change primarily does two things: First, the code generation strategy for passing arguments to `out` and `inout` parameters has been changed. In the past, the compiler would always copy the argument value into a temporary, then pass the address of the temporary, and then write back the value after the call. The new code generation strategy attempts to identify when an argument value already has a simple address in memory and passes that address directly when possible. This eliminates many copy operations that occur before/after calls to functions with `out`/`inout` parameters. Second, we introduce an IR optimization pass that detects call sites where the entire contents of a buffer (usually a constant buffer) is being passed to a callee function, such that many bytes are loaded and then passed even if only very few are used in the callee. The pass moves the load operations from the caller to a specialized version of the the callee where possible (e.g., when the constant buffer in question is a global shader parameter). Doing this eliminates another major category of copies. Notes: * The IR lowering logic is complicated by the fact that several kinds of l-values (values that are usable as the desitnation of assignment, or for `out`/`inout` arguments) are not actually addressable. An easy example is a non-contiguous swizzle like `v.xwz` on a `float4`, where the value occupies 12 bytes, but not 12 consecutive bytes with a single address. There are many more corner cases like that and the IR lowering pass carries a lot of complexity to deal with them. A more systematic overhaul is due some time soon. * The IR representation of `out` and `inout` parameters deserves some careful scrutiny when making these kinds of changes. The official semantics of `inout` in HLSL has been "copy in copy out" (and `out` is just "copy out") which is observably different from any solution that passes in the address of an l-value directly. By making this change we are saying that Slang's semantics are not precisely those of legacy HLSL, and that our semantics for `inout` parameters are closer to those of `inout` in Swift or of a mutable borrow in Rust. In the Swift case the implementation can freely pass the underlying storage of an l-value or the address of a temporary, and valid programs may not observe the different. It is thus illegal to observe the value in a storage local while a mutation to that location is "in flight." All of this is way more detailed and technical than 99% of Slang users will ever care about, but importantly it gives us semantic cover to eliminate these copies in the IR, and also to emit output C++ code that implements `out` and `inout` as by-reference parameter passing. * There was an exsting generic pass for specializing functions based on call sites that uses a "template method" style of pattern to customize its behavior. That pass needed to be generalized to handle this use case because it had previously operated on the assumption that the "desire" to specialize a callee function must be driven by the parameter declarations of that function, and not on the argument values passed in. The code has been slightly refactored to allow the policy for specialization to consider both parameters and arguments. * Unsurprisingly, a bunch of the GLSL (and thus SPIR-V) generated has changed with this work, so several baseline `.slang.glsl` files needed to be updated. * This change is incomplete in that it does not address broader cases of buffer loads, including both partial loads from constant buffers (just loading one field, but a field that uses a "large" structure type), and loads from multi-element buffers (a lot from a structured buffer where the element type is "large"). The main question in each of those cases is how to define how "large" a structure needs to be before we decide to try and sink loads into callee functions like this. In the worst case, sinking loads in this way may actually create more memory traffic (because the same values get loaded in multiple callee functions). * fixup: run premake * fixup: typo	21 July 2021, 19:52:08 UTC
e57ea94	jsmall-nvidia	20 July 2021, 19:17:52 UTC	Option to build all slang-repro in a directory (#1911) * #include an absolute path didn't work - because paths were taken to always be relative. * Add repro-directory feature. * Added writer support to repro-directory. * Upgrade glslang to 11.5.0 * Add -load-repro-directory option * Improve repro doc.	20 July 2021, 19:17:52 UTC
f9f8d3e	Yong He	20 July 2021, 17:22:20 UTC	Minor refactor to gfx D3D12 implementation. (#1913) * Minor refactor to gfx D3D12 implementation. - Allow more flexible collection of shader stages in a shader program. - Add `createRayTracingPipelineState` public interface. (no implementation). * Fix Vulkan initialization. Co-authored-by: Yong He <yhe@nvidia.com>	20 July 2021, 17:22:20 UTC
6162950	Yong He	19 July 2021, 21:47:34 UTC	Enable swiftshader in linux CI builds (#1909)	19 July 2021, 21:47:34 UTC
b00e2dc	jsmall-nvidia	16 July 2021, 12:59:50 UTC	Upgrade glslang to 11.5.0 (#1910) * #include an absolute path didn't work - because paths were taken to always be relative. * Upgrade glslang to 11.5.0 * Remove no longer needed section from glslang-generated/README.md	16 July 2021, 12:59:50 UTC
073bcc1	Andrew Adams	09 July 2021, 22:19:46 UTC	Fix typo in comment. (#1907) Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>	09 July 2021, 22:19:46 UTC
7b447bd	jsmall-nvidia	09 July 2021, 20:29:23 UTC	Make Scope non ref counted (#1904) * Add debug symbols for release build. * Hack to try and capture failing compilation. * Typo fix for capture hack. * Specify return type on lambdas. * Added const. * Try breakpoint. * Up count * Let's capture everything so we can valgrind. * Disable always writing repros. * Make Scope non RefCounted. * Fix issue with not serializing Scope. * More comments around changes to Scope. Remove Scope* from serialization. * Remove code used for testing original issue.	09 July 2021, 20:29:23 UTC
fa565f9	Yong He	09 July 2021, 18:00:44 UTC	Enable testing with Swiftshader. (#1906)	09 July 2021, 18:00:44 UTC
09a251e	Yong He	09 July 2021, 13:12:34 UTC	[gfx] Inline raytracing fixes in response to code review. (#1905) Co-authored-by: Yong He <yhe@nvidia.com>	09 July 2021, 13:12:34 UTC
aba2731	Yong He	08 July 2021, 20:55:21 UTC	Allow render-test to run inline ray tracing tests. (#1903) * Update VS projects to 2019. * Empty commit to trigger build * Implement gfx inline ray tracing on D3D12. * Allow render-test to run inline ray tracing tests. Co-authored-by: Yong He <yhe@nvidia.com>	08 July 2021, 20:55:21 UTC
0995067	Yong He	08 July 2021, 20:30:17 UTC	Implement gfx inline ray tracing on D3D12. (#1902) * Update VS projects to 2019. * Empty commit to trigger build * Implement gfx inline ray tracing on D3D12.	08 July 2021, 20:30:17 UTC
06c4926	Yong He	08 July 2021, 20:04:53 UTC	Update VS projects to 2019. (#1901) * Update VS projects to 2019. * Empty commit to trigger build	08 July 2021, 20:04:53 UTC
a03d21a	Yong He	30 June 2021, 21:59:18 UTC	[gfx] Add inline ray tracing support. (#1899)	30 June 2021, 21:59:18 UTC
5395ef8	jsmall-nvidia	30 June 2021, 17:44:43 UTC	Fix uninitialized access identified by Valgrind (#1898) * #include an absolute path didn't work - because paths were taken to always be relative. * Removes read of uniitialized data identified via valgrind runs. Co-authored-by: Theresa Foley <tfoleyNV@users.noreply.github.com>	30 June 2021, 17:44:43 UTC
9dfaabe	Theresa Foley	25 June 2021, 22:50:29 UTC	Fixes related to combined texture/sampler types (#1894) * Fixes related to combined texture/sampler types Work on #1891 Our intention has always been to support combined texture/sampler types in Slang, both targets like OpenGL where that is the only option available and for targets like Vulkan where it can be beneficial to performance. Because Slang's current users mostly focus on D3D12+Vulkan codebases, they strongly prefer separate textures and samplers, and the relevant support code in Slang has "bit-rotted" over many releases until the functionality that was there isn't useful any more. This change significantly overhauls the implementation of combined texture/sampler types and adds a test that uses them in the hopes of avoiding future regressions. The new combined texture/sampler types use the prefix `Sampler`, so where there is an existing standalone `Texture2D` type the equivalent combined texture/sampler type will be `Sampler2D`. The intention is that this naming mirrors the GLSL conventions (where the type is `sampler2d`) while following HLSL naming precedent (to the extent it exists). The operations available on a `Sampler2D` are intended to be those that are available on a `Texture2D`, and it is just that in cases where the `Texture2D` operation would take a separate sampler argument: Texture2D t = ...; SamplerState s = ...; float4 result = t.Sample(s, uv); the equivalent `Sampler2D` operation just elides that argument: Sampler2D s = ...; float4 result = s.Sample(uv); In terms of implementation, there are a lot of subtle details here: * I've tried to use the same metaprogramming logic that generates all the stdlib declarations for `Texture` to also generate `Sampler` in the hopes that this helps keep them in sync. * The big catch to the above is that it means that for certain operations the indidces of parameters depend on whether or not an explicit sampler parameter is used. Rather than try to tweak the indices in the stdlib generation logic (which is already complicated) I went and added Yet Another Hack to the logic that handles intrinsic definition strings. Basically, the special-case handling of `$p` has been modified so that it also applies a negative offset to future parameter references in the same intrinsic string. * Trying to actually bring this up in our test framework revealed that the "flat" reflection API was seemingly not reflecting combined texture/sampler types correctly at all (it was reflecting them as just plain textures). Other than that issue, the Vulkan path seems to work fine with combined texture/samplers. * I also had to add logic to the `TEST_INPUT` parsing to re-introduce handling of the combined types (that was something I consciously left out to reduce the amount of code in the earlier refactor there). Luckily, the architecture is such that a combined texture/sampler can leverage most of the existing logic for the separate cases. * fixup: reveiw feedback	25 June 2021, 22:50:29 UTC
81c1692	Yong He	25 June 2021, 21:42:00 UTC	Update Vulkan headers (#1896) Co-authored-by: Theresa Foley <tfoleyNV@users.noreply.github.com>	25 June 2021, 21:42:00 UTC
962a660	jsmall-nvidia	25 June 2021, 16:08:18 UTC	Small fixes to make compile on CentOS (#1897) * Fix CentOS/gcc compile issue.	25 June 2021, 16:08:18 UTC
5427411	jsmall-nvidia	24 June 2021, 14:00:23 UTC	Remove StructTag and associated systems (#1895) * #include an absolute path didn't work - because paths were taken to always be relative. * Remove StructTag and associated systems. * Fix typo and remove unit test for StructTag.	24 June 2021, 14:00:23 UTC
353777e	Yong He	23 June 2021, 19:14:14 UTC	Change default VS version to 2017. (#1893)	23 June 2021, 19:14:14 UTC
0f2d11d	Yong He	23 June 2021, 18:33:30 UTC	[gfx] Add `IBufferResource::getDeviceAddress()`. (#1892)	23 June 2021, 18:33:30 UTC
0afa24a	jsmall-nvidia	18 June 2021, 21:09:35 UTC	StructTag versioning (#1888) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP Abi struct. * Use AbiSystem on SessionDesc. * Use mask/shift constants. * Fix issue causing warning on linux. * Abi -> Api. * Fix typo. * Refactor to use StructTag. * Mechanism to be able to follow fields. * Field adding is working. * WIP with StructTagConverter. * First pass of StructTag appears to work. Still needs diagnostics. * Small tidy up around Field. * Use bit field to record what fields are recorded to remove allocation around the m_stack. Use ScopeStack for RAII. * Return SlangResult instead of pointers. * Use SlangResult with copy. * Split StructTagConverter implementations. * Fix some bugs around lazy converting. * First pass at unit test for StructTag. * Testing StructTag going backwards in time. * First pass as StructTag diagnostics. * Make Traits a namespace. * Fix some issues with Traits not being a class. * Fix 32 bit warning.	18 June 2021, 21:09:35 UTC
8905125	Theresa Foley	16 June 2021, 21:35:25 UTC	Initial support for variadic macros (#1887) This change adds support for variadic macros in the C-style preprocessor, e.g.: #define DEBUG(MSG, ...) print(__FILE__, __LINE__, MSG, __VA_ARGS__) Similar to the gcc preprocessor, this feature supports both named variadic macro parameters and unnamed ones (which then default to `__VA_ARGS__`. The implementation work is mostly straightforward, although there are a few subtle design choices worth mentioning: * A variadic macro is represented by it having a variadic parameter that is part of the ordinary parameter list. * Argument parsing does not detect whether the macro being invoked is variadic and collect/combine arguments to form a single argument value for the variadic parameter. This is motivated by the need for some extensions to differentiate a variadic parameter receiving a single empty argument vs. zero arguments. * Because any reference to the variadic parameter needs to expand to the comma-separated arguments that match it, the logic for turning a macro parameter reference into a list of tokens has been factored out into a subroutine that handles the details. * The choice in the earlier refactor to have a macro invocation collect all the argument tokens (including the intervening commas) into a single token list seems to pay off here, because it means that the tokens in the expansion of a variadic parameter reference were already stored contiguously. * The special-case logic for handling an empty argument list had to be tweaked again to ensure that an empty argument list is treated as having zero arguments for the variadic parameter. Note that historically C did not define the behavior of this case, and always required at least one argument for any variadic macro parameter. * The logic for checking whether the number of arguments to a macro invocation is valid needed to handle variadic and non-variadic macros as distinct cases. There really isn't much overlap in how the checks need to work, even if we tried to change the underling representation. The main missing feature here is any way to discard a comma in a macro body that appears before a variadic parameter reference, e.g.: #define DEBUG(...) print("debug:", __VA_ARGS__) In this case, an empty invocation list `DEBUG()` will expand to `print("debug:",)` - a call with a trailing comma in the argument list. If users end up needing a way to discard commas in cases like this, we have many options we can consider. This change does not implement any of them to keep the initial work as minimal as possible.	16 June 2021, 21:35:25 UTC
45f737d	jsmall-nvidia	14 June 2021, 14:36:42 UTC	Improve comments around -Xarg and C++/CUDA layout (#1884) * #include an absolute path didn't work - because paths were taken to always be relative. * Alter comments around layout size/alignment to reflect nuance on C++/CUDA. * Fix some errors in -X documentation, and clarify some of the behavior. * Small doc improvements.	14 June 2021, 14:36:42 UTC
746ee0d	Yong He	11 June 2021, 19:49:04 UTC	Properly fill `declref` in `Linkage::getContainerType`. (#1882) * Properly fill `declref` in `Linkage::getContainerType`. * Fix timestamp query on cpu * Fix typo. Co-authored-by: Yong He <yhe@nvidia.com>	11 June 2021, 19:49:04 UTC
37e8917	jsmall-nvidia	10 June 2021, 18:57:09 UTC	CUDA layout corner cases/testing (#1881) * #include an absolute path didn't work - because paths were taken to always be relative. * Add support for sizeOf/alignOf/offsetOf to stdlib. Add $G intrinsic expansion that works of the generic parameters not the param type * Test cuda layout. * Fix CUDA layout issues. Fix reflection to handle other built in types. Fix __offsetOf * Tests of reflection and layout as reported directly from CUDA. * Comment about use of aligned size as size. * Fix warning from VS. * Check alignment is pow2. * Small improvements to alignment calcs. * Tab to spaces. * Fix alignment pointer sizes on 32 bit OS for CUDA. * Fix CUDA reflection on 32 bit.	10 June 2021, 18:57:09 UTC
0d9bd79	Yong He	10 June 2021, 07:30:19 UTC	Support timestamp queries in `gfx`. (#1880) * Support timestamp queries in `gfx`. * Fix tab Co-authored-by: Yong He <yhe@nvidia.com>	10 June 2021, 07:30:19 UTC
86b0d74	jsmall-nvidia	09 June 2021, 18:00:44 UTC	Enable some VK texture tests (#1878)	09 June 2021, 18:00:44 UTC
430b832	Yong He	09 June 2021, 14:34:52 UTC	Fix CUDA vector layout logic. (#1879) And rename debug symbols for navis. Co-authored-by: Yong He <yhe@nvidia.com>	09 June 2021, 14:34:52 UTC
6cee1ee	Yong He	08 June 2021, 14:44:05 UTC	Various fixes to CUDA backend. (#1877) - Fix emitting `StructuredBuffer<ISomething>::Load`, which triggers emitting for `IROp_WrapExistential` that is previously unhandled. - Fix cuda layout around vectors, they should be aligned to 1,2,4,8,16 bytes instead of just using element type's alignment. That means `float4` has alignment of 16 instead of 4. - Fix `SLANG_CUDA_HANDLE_ERROR` macro definition. - Fix navis sometimes fail to find `Slang::kIROp_*` enum values when debugging external projects. Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>	08 June 2021, 14:44:05 UTC
fb50fab	jsmall-nvidia	08 June 2021, 12:48:47 UTC	Fix RWTexture issues on CUDA (#1876) * #include an absolute path didn't work - because paths were taken to always be relative. * Re-enable CUDA RWTexture tests. Re-enable RWTexture1D test Make sure tests have only single mip for RWTexture (required for CUDA) * Fix issue with reading CUDA surface. Re-enable working CUDA RWTextureTest. Enable 1D case.	08 June 2021, 12:48:47 UTC
5974f3e	jsmall-nvidia	06 June 2021, 17:20:42 UTC	Made 'slang-test' the startup project on visual studio (#1873) * #include an absolute path didn't work - because paths were taken to always be relative. * Make slang-test the startup project.	06 June 2021, 17:20:42 UTC
bbd6df7	jsmall-nvidia	06 June 2021, 16:43:19 UTC	Fixed issue around 4xFloat16 texture on CUDA (#1874) * #include an absolute path didn't work - because paths were taken to always be relative. * Fixes around Float16. Incorrect calculation of 'elementSize'.	06 June 2021, 16:43:19 UTC
688d5fa	T. Foley	06 June 2021, 16:27:19 UTC	Include a "stack trace" with nested-import errors (#1872) * Include a "stack trace" with nested-import errors When errors occur in nested `#include` files it is often helpful to have a "stack trace" / traceback of the `#include` chain that led from a root translation unit to the file with an error. This change implements a similar feature for `import`s. It is worth noting that `import`s don't really require this kind of compiler support the way `#include`s do because the intention is that the meaning of an `import`ed file does not depend on the order or nesting of `import`s. As such, when trying to fix an error in an `import`ed file, you usually don't care how it came to be `import`ed into your shaders. The use case here is somebody adapting a large body of Slang code to use in a different codebase, such that they have certain `.slang` files they don't actually intend to have compile correctly, and they want to be able to diagnose how they came to include those files when/if they cause problems. The actual feature implementation is pretty simple because we already track a stack of active `import`s so that we can detect and diagnose recursive `import`s. This change simply changes the disagnostics when there is an error in imported code so that instead of just noting the inner-most `import` site it lists all the `import` sites that were active at the time. The change includes a test case to confirm that the behavior works (at least for the case of a parse error). * fixup: test outputs Co-authored-by: Yong He <yonghe@outlook.com> Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>	06 June 2021, 16:27:19 UTC
58d7af3	T. Foley	06 June 2021, 15:59:28 UTC	Fix a bug in preprocessor "busy" logic (#1875) * Fix a bug in preprocessor "busy" logic This bug manifested in both incorrect preprocessor output for certain complicated cases, and also (more importantly) a use-after-free issue. One of the "clever" design choices in the Slang preprocessor implementation is that the set of "busy" macros during expansion is implicitly defined by a linked list of those invocations that are actively being read from as part of the input stack. This logic works very well for checking whether a macro name is busy before triggering expansion, and for computing what macros should be considered busy during expansion of an object-like macro. The problem case here was with function-like macros where the preprocessor was re-using the same list of busy macros that had been fetched when reading the macro name, but doing so after all the macro argument tokens had been read. Because additional tokens had been read from the input stream stack, there was no guarantee that the invocations that had been active before were still live. The new logic computes the set of busy macros fresh before starting expansion of a function-like macro invocation. A test is included to ensure that the case that showed the use-after-free bug has been fixed. In addition, the new logic is careful to compute the busy macros only based on where subsequent tokens will come from and not based on any macros that might have contributed to the argument tokens themselves. A test case has been added that relies on getting this detail correct. * Update slang-preprocessor.cpp Remove a test typo. Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>	06 June 2021, 15:59:28 UTC
1617c2d	Nathan V. Morrical	04 June 2021, 23:18:14 UTC	Enable tracing rays with OptiX backend (#1871) * OptiX ray payload can now be read from and written to using the two payload register pointer method * changing op to more descriptive name * small tweak to allow for dumping out intermediate source for cuda targets * initial trace ray call compiling * hit attributes now work for float and int types, and vectors thereof * Hitgroups using structs and arrays now work with optix Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>	04 June 2021, 23:18:14 UTC
95a90d7	Yong He	04 June 2021, 21:32:22 UTC	Fix D3D11 `uploadBufferResource`. (#1869)	04 June 2021, 21:32:22 UTC
bf068b1	Yong He	04 June 2021, 17:39:21 UTC	Escape `\` in slang-embed. (#1870)	04 June 2021, 17:39:21 UTC
e67af5b	Yong He	02 June 2021, 23:58:25 UTC	Various Fixes to gfx, reflection and emit. (#1867) * Various Fixes to gfx, reflection and emit. - Fix GLSL emit to properly output `bitsTo` functions for `IRBitCast` insts. - Add line directive mode setting for `ISession`. - Extend `TypeLayout::getElementStride` to handle `VectorType` case. - Fix `IDevice::readBufferResource` 's D3D12 implementation to copy only the requested bytes out. - Fix `render-test` to use the `ISession` from `gfx` instead of creating its own `ISession` to make sure `gfx` and `render-test` agree on WitnessTable and RTTI IDs. - Extend `render-test` to support filling vector and matrix values in the new `set x = ...` TEST_INPUT syntax. - Add a `dynamic-dispatch-15` test case to make sure packing / unpacking works correctly across all targets, and to make sure render-test's RTTI/WitnessTable ID filling logic is correct for non-trivial cases. * Remove default-major test * Fix cyclic reference in `ExtendedTypeLayout`. * Move `lineDirectiveMode` setting to `TargetDesc`. Add `structureSize` to `TargetDesc` and `SessionDesc` for future binary compatibility. * Cleanup. Co-authored-by: Yong He <yhe@nvidia.com>	02 June 2021, 23:58:25 UTC
8e57166	Yong He	02 June 2021, 18:37:45 UTC	Fix cyclic reference in `ExtendedTypeLayout`. (#1868)	02 June 2021, 18:37:45 UTC
7e93e81	jsmall-nvidia	02 June 2021, 17:44:04 UTC	Downstream compiler location fix. (#1866) * #include an absolute path didn't work - because paths were taken to always be relative. * Move logic to determine how to load a downstream shared library to DownstreamCompilerUtil::loadSharedLibrary. Handle if the given path is directory or a file. * Improve comment. * Special handling for a downstream compiler loading SharedLibrary detection - take into account filename may be different. * Handle cases if path is set but paths can't be determined. * Fix typo. Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>	02 June 2021, 17:44:04 UTC
b699a36	jsmall-nvidia	02 June 2021, 16:58:08 UTC	JSONBuilder (#1865) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP JSONWriter/JSONParser. * Checking different Layout styles for JSON. * Add slang-json-parser.h/.cpp * WIP JSONValue. * Added JSONValue::destroy/Recursive. * Improvement to JSONValue. * Improve text double conversion precision. Testing. * Simplify double parsing (just use atof). JSON comparison More testing of conversions and start of JSONValue. * Add <math.h> for isnan, isinf etc. * Small improvement with object comparison. * Fix typo in getArgsByName. * Removed use of isnan and isinf as includes don't work on linux. * Improve JSON unit test. * Added asInteger/asFloat/asBool to JSONValue. * Add SourceLoc to JSONListener. * Added ability to walk the JSONValue * JSONBuilder. * Add converting from lexemes via JSONBuilder. * Fix VS warning. * Fix warning for res not being used.	02 June 2021, 16:58:08 UTC
7a3c87b	jsmall-nvidia	01 June 2021, 20:58:07 UTC	JSONValue / Container (#1864) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP JSONWriter/JSONParser. * Checking different Layout styles for JSON. * Add slang-json-parser.h/.cpp * WIP JSONValue. * Added JSONValue::destroy/Recursive. * Improvement to JSONValue. * Improve text double conversion precision. Testing. * Simplify double parsing (just use atof). JSON comparison More testing of conversions and start of JSONValue. * Add <math.h> for isnan, isinf etc. * Small improvement with object comparison. * Fix typo in getArgsByName. * Removed use of isnan and isinf as includes don't work on linux. * Improve JSON unit test. * Added asInteger/asFloat/asBool to JSONValue. * Change comment to trigger CI build.	01 June 2021, 20:58:07 UTC
67486ee	jsmall-nvidia	28 May 2021, 21:15:12 UTC	Glslang refactor bugfix (#1863) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix issue with with SLANG_ENABLE_GLSLANG_SUPPORT * Update expected output from glslang-error.glsl * Fix bug in glsl dissassembly. * Make ExtensionTracker available even if source is not emitted. * Only explicitly set extension tracker based on capability bits, if we are in pass through. * Small simplification of invoke sourceEmit.	28 May 2021, 21:15:12 UTC
89faa8a	T. Foley	27 May 2021, 22:05:34 UTC	Fix initializer lists for derived structs (#1862) If the user has a derived `struct` type: ```hlsl struct Base { int b = 1; } struct Derived : Base { int d = 2; } ``` Then it is still reasonable for them to want to use initializer lists when declaring variables using the `Derived` type: ```hlsl Derived x = {}; Derived y = { 7, 8 }; ``` This change implements two missing pieces of functionality in the Slang compiler to allow this case: * First, when the front-end semantic checks are applied to an initializer list, if the type being initialized is a derived `struct` type it always expects to find initialization arguments for its base type before those for its fields. * Second, when lowering an initializer-list expression from the AST to the IR, the compiler expects the first argument in the list to be the initial value for the base field (if any). This also applies to default-initialization of fields/variables. This change slightly entangles front-end logic with the logic for how struct inheritance is lowered to the IR, but the behavior is unlikely to confuse users who expect C++-like layout. It is worth noting that with this change it should be possible to initialize the base type using either a nested initializer list or flat arguments: ```hlsl struct BigBase { int x; int y; int z; } struct BigDerived : BigBase { int w; } BigDerived a = { {1,2,3}, 4 }; BigDerived b = { 1, 2, 3, 4 }; ``` This behavior should Just Work because of the existing C-like rules for initializer lists where an aggregate can be initialized by either a `{}`-enclosed block or distinct values for its leaf fields.	27 May 2021, 22:05:34 UTC
63dcc7a	T. Foley	27 May 2021, 20:40:17 UTC	Fix a bug in struct inheritance (#1861) During lowering from AST to IR, the Slang compiler translates code that uses `struct` inheritance: ```hlsl struct Base { int a; } struct Derived : Base {} ``` into code where the inheritance relationship is "witnessed" by a simple field: ```hlsl struct Base { int a; } struct Derived { Base __anonymous_field__; } ``` The underlying bug here is that the `__anonymous_field__` that the compiler generated during IR lowering was not being given any linkage decorations (no mangled name). As a result, if multiple separately-compiled modules all access that field they could disagree on its identity as an IR instruction. This could lead to output code being generated where the declaration of `__anonymous_field__` uses one IR instruction, but accesses use another. This change includes a fix for the issue, and a test that serves as a reproducer for the original problem.	27 May 2021, 20:40:17 UTC
15f5cff	jsmall-nvidia	27 May 2021, 19:38:52 UTC	JSON Parser and Writer (#1859) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP JSONWriter/JSONParser. * Checking different Layout styles for JSON. * Add slang-json-parser.h/.cpp	27 May 2021, 19:38:52 UTC
969943f	Nathan V. Morrical	26 May 2021, 13:15:08 UTC	small tweak to allow for dumping out intermediate source for cuda targets (#1855) Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com> Co-authored-by: Yong He <yonghe@outlook.com> Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>	26 May 2021, 13:15:08 UTC
2a2f50f	T. Foley	26 May 2021, 13:00:26 UTC	Fix a bug for enumerations with explicit "tag type" (#1856) The basic bug here was that `enum` types with an explicit tag type: enum Color : int32_t { ... } would have an `InheritanceDecl` implying that `Color` inherits from `int32_t`. The problem is that this is not actually an inheritance relationship, since a `Color` needs to be explicitly cast to/from an `int32_t`. Various parts of the compiler currently treat this case like real inheritance, and as a result the operations taht would apply to an `int32_t` end up applying to a `Color` as well. This particularly leads to an ambiguity between applying the `==` operator, because it has overloads for both the `__EnumType` and `__Builtin{something}` interfaces. The fix here is to explicitly exclude the `InheritanceDecl` that represents an enumeration tag type when considering declared subtype relationships. A more complete version of this fix would need to go through all places in the code where `InheritanceDecl`s are used and make sure that any places using them for true inheritnace relationships ignore those that represent an enumeration tag type. (An alternative option would be to use a distinct kind of `Decl` to represent the tag-type relationship, perhaps even going so far as to modifying the type of the relevant AST node as part of semantic checking) This change includes a regression test for the way this bug surfaced in user code. Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>	26 May 2021, 13:00:26 UTC
7d1b8ac	jsmall-nvidia	26 May 2021, 00:58:43 UTC	JSON Lexing and string encoding/decoding (#1858) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP Json lexer. * Check JSON Lex with unit test * Add JSON escaping/unescaping of strings. * Big fix encoding/decoding. * Fix typo in JSON diagnostics. * Fix typo. * Better float testing.	26 May 2021, 00:58:43 UTC
89f67d9	Yong He	25 May 2021, 22:22:39 UTC	Rework shader object specialization control interface. (#1857)	25 May 2021, 22:22:39 UTC
ba24264	Yong He	25 May 2021, 17:24:38 UTC	Allow overriding specialization args via `IShaderObject`. (#1854) * Allow overriding specialization args via `IShaderObject`. * Fixes. Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>	25 May 2021, 17:24:38 UTC
fbf00dd	Nathan V. Morrical	25 May 2021, 17:06:54 UTC	OptiX ray payload read/write support in raytracing pipeline shaders (#1853) * OptiX ray payload can now be read from and written to using the two payload register pointer method * changing op to more descriptive name * fixup: comment change to re-trigger CI Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>	25 May 2021, 17:06:54 UTC
34a1ff5	jsmall-nvidia	22 May 2021, 20:03:30 UTC	Improvements in -X support (#1852) * #include an absolute path didn't work - because paths were taken to always be relative. * Added SourceLoc handling for command line parsing. * Fix typo in debug. * Fix issue around the DiagnosticSink used in options parsing not having a writer available - by having DiagnosticSink parenting. * Small rename for clarity. * WIP extracting command line args for downstream tools. * Unit tests/bug fixes around extracting args. * Use DownstreamArgs in the EndToEndCompileRequest * Passing downstream compiler options downstream. * Fix issue with endToEndReq being nullptr. * Fix issue with diagnostics number change. * Small improvements to how the source line is displayed if it's too long. Default to 120, as suggested in previous review. * Make render test use x-args parsing and CommandArgReader. * Added missing diagnostics. * More DownstreamArgs to linkage so can be seen by 'components'. Added dxc-x-arg test. * Used combination of name and args instead of two Lists, which whilst equivalent was perhaps a little confusing. * Added documentation for -X support. * Added test for x-args parsing diagnostic. Improved diagnostic with list of known names. * Fix issues from merge. * Fix lookup for -matrix-layout-column-major in render test. * Remove commented out line.	22 May 2021, 20:03:30 UTC
7f8a999	Yong He	21 May 2021, 23:38:33 UTC	[gfx] Support StructuredBuffer<IInterface>. (#1851) Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>	21 May 2021, 23:38:33 UTC
172538f	jsmall-nvidia	21 May 2021, 22:41:54 UTC	Downstream option handling (#1850) * #include an absolute path didn't work - because paths were taken to always be relative. * Added SourceLoc handling for command line parsing. * Fix typo in debug. * Fix issue around the DiagnosticSink used in options parsing not having a writer available - by having DiagnosticSink parenting. * Small rename for clarity. * WIP extracting command line args for downstream tools. * Unit tests/bug fixes around extracting args. * Use DownstreamArgs in the EndToEndCompileRequest * Passing downstream compiler options downstream. * Fix issue with endToEndReq being nullptr. * Fix issue with diagnostics number change. * Small improvements to how the source line is displayed if it's too long. Default to 120, as suggested in previous review. Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>	21 May 2021, 22:41:54 UTC
0389546	T. Foley	21 May 2021, 22:07:21 UTC	Overhaul the preprocessor (#1849) * Overhaul the preprocessor The old Slang preprocessor was based on a simple mental model that tried to unify two parts of macro expansion: * Scanning for macro invocations in a sequence of tokens * Producing the expanded tokens for a macro expansion by substituting arguments into its body The basic was that substitution of macro arguments into a macro definition is superficially similar to top-level macro expansion, just with an environment where the macro arguments act like `#define`s for the corresponding parameter names. That approach was "clever" and could conceivably have been extended to include a lot of advanced preprocessor features (e.g., a preprocessor-level `lambda` would be easy to support!), but it was basically impossible to make it correctly handle all the corner cases of the full C/C++ preprocessor. The fundamental problem with the old approach was that it conflated the two parts of expansion listed above into one implementation, while the various special cases of the C/C++ preprocessor rely on treating the two cases very differently. The new approach here (which is somewhere between a refactor and a full rewrite of the preprocessor) changes things up in a few key ways: * The abstraction still cares a lot about streams of tokens, but it now treats the top level streams (`InputFile`s) as fairly different from the lower-level streams (`InputStream`s) * Macro expansion is handled as a dedicated type of stream that wraps another stream. This allows macro expansion to be applied to anything, and supports cases where multiple rounds of macro expansion are required by the spec. * Macro invocations and the substitution of their arguments are now handled by a completely new system. * Macro arguments are no longer treated as if they were `#define`s * The macro body/definition is analyzed at definition time to detect various kinds of issues, and to derive a list of "ops" that make it easier to "play back" the definition at substitution time * Token pasting and stringizing are now only handled in macro definitions (rather than being allowed anywhere), and their use cases are restricted to only those that make sense (e.g., you can't stringize anythign except a macro parameter, because anything else wouldn't make sense) The key new types here are the `ExpansionInputStream` which handles scanning for macro invocations, and the `MacroInvocation` type, which handles playing back the macro body with substitutions. The `ExpansionInputStream` is the easier of the two to understand. By refactoring it to use a single token of lookahead, the one major detail it had to deal with before (abandoning expansion of a function-like macro if the macro name was not followed by `(`) is significantly easier to manage. The more subtle part is the `MacroInvocation` type, and most of the complexity there is around handling of token pasting, and the fact that either or both of the operands to a token paste might be empty. Many of the test cases that exposed the problems in the preprocessor have been moved from `current-bugs` to `preprocessor` since they now work correctly. * debugging: enable extractor command line dump * fixup * fixup	21 May 2021, 22:07:21 UTC
c4c90f5	jsmall-nvidia	19 May 2021, 21:53:24 UTC	SourceLoc use in command line processing (#1848) * #include an absolute path didn't work - because paths were taken to always be relative. * Added SourceLoc handling for command line parsing. * Fix typo in debug. * Fix issue around the DiagnosticSink used in options parsing not having a writer available - by having DiagnosticSink parenting. * Small rename for clarity. Co-authored-by: T. Foley <tfoleyNV@users.noreply.github.com>	19 May 2021, 21:53:24 UTC
61e9154	jsmall-nvidia	19 May 2021, 19:57:11 UTC	Glslang as DownstreamCompiler (#1846) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP Fxc as downstream compiler. * First pass FXC downstream compiler working. * GCC compile fix. * Fix FXC parsing issue. * Special case filesystem access. * Use StringUtil getSlice. * Fix isses with not emitting source for FXC. * WIP on DXC. * Small fixes for DXBC handling. * Removed DXC from ParseDiagnosticUtil (can use generic) Try to improve output for notes from DXC. * FIrst pass of Glslang as DownstreamCompiler * Fix some problems with parsing for glslang replacement. * Add slang-glslang-compiler.cpp/.h * Fix downstream for spir-v output. * dissassemble -> disassemble * Fix typo and improve some naming/comments. * Remove getSharedLibrary from DownstreamCompiler * Removed some no longer used diagnostics.	19 May 2021, 19:57:11 UTC
d5e8044	jsmall-nvidia	15 May 2021, 15:45:58 UTC	Read half->float RWTexture conversion (#1842) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix for writing to RWTexture with half types on CUDA. * CUDA half functionality doc updates. * First pass support for sust.p RWTexture format conversion on write. * Tidy up implementation of $C. Made clamping mode #define able. * A simple test for RWTexture CUDA format conversion. * Add support for float2 and float4. * WIP conversion testing. * Use $E to fix byte addressing in X in CUDA. * Do not scale when accessing via _convert versions of surface functions. * Revert to previous test. * Test with half/float convert write/read. * More broad half->float read conversion testing. * Improve documentation around half and RWTexture conversion.	15 May 2021, 15:45:58 UTC
bfe7561	jsmall-nvidia	15 May 2021, 15:22:14 UTC	Surface access on CUDA is byte addressed in X (#1841) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix for writing to RWTexture with half types on CUDA. * CUDA half functionality doc updates. * First pass support for sust.p RWTexture format conversion on write. * Tidy up implementation of $C. Made clamping mode #define able. * A simple test for RWTexture CUDA format conversion. * Use $E to fix byte addressing in X in CUDA. * Do not scale when accessing via _convert versions of surface functions.	15 May 2021, 15:22:14 UTC
1027225	jsmall-nvidia	15 May 2021, 14:52:55 UTC	Support for HW format conversions for RWTexture on CUDA (#1840) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix for writing to RWTexture with half types on CUDA. * CUDA half functionality doc updates. * First pass support for sust.p RWTexture format conversion on write. * Tidy up implementation of $C. Made clamping mode #define able. * A simple test for RWTexture CUDA format conversion.	15 May 2021, 14:52:55 UTC
1856b8a	jsmall-nvidia	14 May 2021, 22:38:08 UTC	DXC as DownstreamCompiler (#1845) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP Fxc as downstream compiler. * First pass FXC downstream compiler working. * GCC compile fix. * Fix FXC parsing issue. * Special case filesystem access. * Use StringUtil getSlice. * Fix isses with not emitting source for FXC. * WIP on DXC. * Small fixes for DXBC handling. * Removed DXC from ParseDiagnosticUtil (can use generic) Try to improve output for notes from DXC.	14 May 2021, 22:38:08 UTC
d4316c8	jsmall-nvidia	14 May 2021, 21:50:00 UTC	FXC as DownstreamCompiler (#1844) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP Fxc as downstream compiler. * First pass FXC downstream compiler working. * GCC compile fix. * Fix FXC parsing issue. * Special case filesystem access. * Use StringUtil getSlice. * Fix isses with not emitting source for FXC. * Small fixes for DXBC handling.	14 May 2021, 21:50:00 UTC
79d106f	jsmall-nvidia	14 May 2021, 21:32:52 UTC	Fix for KernelContext threading issue for C++ targets (#1843) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix for issue where threading KernelContext was not working on C++ test when there were multiple invocations. * Improve test for context threading.	14 May 2021, 21:32:52 UTC
12bcc03	jsmall-nvidia	14 May 2021, 20:59:35 UTC	CUDA half RWTexture write support/doc improvements (#1839) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix for writing to RWTexture with half types on CUDA. * CUDA half functionality doc updates.	14 May 2021, 20:59:35 UTC
a2725fd	Tim Foley	06 May 2021, 23:08:15 UTC	Fix for uninitialized field (#1838) The `OverloadedExpr` type didn't provide a default value for its field: Name* name; This led to a null-pointer crash in the logic that deals with synthesizing interface requirements because it creates an `OverloadedExpr` but doesn't initialize the field. This change makes two fixes: 1. The logic in the synthesis path actually initializes `name` so that it can feed into any downstream error messages 2. The `OverloadedExpr` declaration now includes an initial value for `name` so that it will at least be null instead of garbage if we slip up again	06 May 2021, 23:08:15 UTC
8ee5e45	jsmall-nvidia	06 May 2021, 22:09:44 UTC	Support for reads from RWTexture<half> (#1837) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out StringEscapeUtil. * Added StringEscapeUtil. * Fix typo in unix quoting type. * Small comment improvements. * Try to fix linux linking issue. * Fix typo. * Attempt to fix linux link issue. * Update VS proj even though nothing really changed. * Fix another typo issue. * Fix for windows issue. Fixed bug. * Make separate Utils for escaping. * Fix typo. * Split out into StringEscapeHandler. * Windows shell does handle removing quotes (so remove code to remove them). * Handle unescaping if not initiating using the shell. * Slight improvement around shell like decoding. * Simplify command extraction. * Add shared-library category type. * Fix bug in command extraction. * Typo in transcendental category. * Enable unit-test on in smoke test category. * Make parsing failing output as a failing test. * Fixes for transcendental tests. Disable tests that do not work. * Changed category parsing. * Removed the TestResult parameter from _gatherTestsForFile. Made testsList only output. * Remove testing if all tests were disabled. * Make args of CommandLine always unescaped. * Add category. * Don't need escaping on unix/linux. * Remove some no longer used functions. * Add requireSMVersion to CUDAExtensionTracker. * half-calc.slang now works for CUDA. * bit-cast-16-bit works on CUDA. * WIP handling of CUDA vector<half> types. * Half swizzle CUDA. * Half vector test. * Fix swizzle half bug. * Fix compilation issue with narrowing to Index. * Add unary ops. * Add some vector scalar maths ops. * Add half vector conversions for CUDA. * Fix erroneous comment. * Support for half comparisons. * First pass test for half compare. * Fix bug in CUDA specialized emit control. Updated tests to have pre and post inc/dec. * Removed unneeded parts of the cuda prelude. * Half structured buffer works on CUDA. * Added name lookup for Gfx::Format * Support half texture type in test system. * Test for half reading on CUDA. * Add half formats to Vk and D3D utils. * Fix getAt for CUDA - where there might not be a .x member in a vector. * Template specialization for half surface access works. * Half RWTexture support. * Test for half RWTexture access. * Update half-rw-texture test. * Remove test function from CUDA prelude.	06 May 2021, 22:09:44 UTC
e510a28	jsmall-nvidia	06 May 2021, 16:45:00 UTC	Half texture support (#1836) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out StringEscapeUtil. * Added StringEscapeUtil. * Fix typo in unix quoting type. * Small comment improvements. * Try to fix linux linking issue. * Fix typo. * Attempt to fix linux link issue. * Update VS proj even though nothing really changed. * Fix another typo issue. * Fix for windows issue. Fixed bug. * Make separate Utils for escaping. * Fix typo. * Split out into StringEscapeHandler. * Windows shell does handle removing quotes (so remove code to remove them). * Handle unescaping if not initiating using the shell. * Slight improvement around shell like decoding. * Simplify command extraction. * Add shared-library category type. * Fix bug in command extraction. * Typo in transcendental category. * Enable unit-test on in smoke test category. * Make parsing failing output as a failing test. * Fixes for transcendental tests. Disable tests that do not work. * Changed category parsing. * Removed the TestResult parameter from _gatherTestsForFile. Made testsList only output. * Remove testing if all tests were disabled. * Make args of CommandLine always unescaped. * Add category. * Don't need escaping on unix/linux. * Remove some no longer used functions. * Add requireSMVersion to CUDAExtensionTracker. * half-calc.slang now works for CUDA. * bit-cast-16-bit works on CUDA. * WIP handling of CUDA vector<half> types. * Half swizzle CUDA. * Half vector test. * Fix swizzle half bug. * Fix compilation issue with narrowing to Index. * Add unary ops. * Add some vector scalar maths ops. * Add half vector conversions for CUDA. * Fix erroneous comment. * Support for half comparisons. * First pass test for half compare. * Fix bug in CUDA specialized emit control. Updated tests to have pre and post inc/dec. * Removed unneeded parts of the cuda prelude. * Half structured buffer works on CUDA. * Added name lookup for Gfx::Format * Support half texture type in test system. * Test for half reading on CUDA. * Add half formats to Vk and D3D utils. * Fix getAt for CUDA - where there might not be a .x member in a vector.	06 May 2021, 16:45:00 UTC
85632e8	Tim Foley	04 May 2021, 23:59:54 UTC	Add support for returning structures that contain opaque types (#1835) Introduction ============ Several of our target platforms share a concept of "opaque" types, including resources (`Texture2D`) and samplers (`SamplerState`), which are restricted in how they can be used. GLSL and SPIR-V place very severe restrictions, in that opaque types cannot be used for the type of: * (mutable) local variables * (mutable) global variables * structure fields * Function result/return * `out` or `inout` parameters The HLSL language allows all of these cases, but with the practical caveat that the compiler front-end must be able to statically analyze how opaque types have been used and "optimize away" all of the above cases. For example, it is legal to have a local variable of an opaque type, but at any point where the variable gets used it must be statically known which top-level shader parameter the variable refers to. Existing Work ============= In the Slang compiler we need to implement our own passes to detect these "illegal" uses of opaque types and legalize them. The work is basically broken into two distinct steps: * The existing `legalizeResourceTypes()` pass detects illegal types (e.g., a `struct` that has a field of type `Texture2D`) and replaces them with legal types, sometimes by splitting apart declarations (e.g., a parameter using such a `struct` type gets split into multiple parameters). At a high level, we can think of this as "exposing" opaque types so that they are not hidden inside of nested structures. * Next, the `specializeResourceOutputs()` pass detects calls to functions that output opaque types (whether by the function return value of `out` / `inout` parameters). The pass analyzes the body of such functions, and tries to isolate the logic that determines their resource-type outputs and hoise that logic into call sites (so that the opaque-type outputs can then be eliminated). This Change =========== One important missing case was that the type legalization step was incapable of legalizing types that appear in the result/return type of functions. The existing logic would simply diagnose an internal/unimplemented error if it ecountered a non-simple type in the return position. At a high-level, supporting this case seems simple enough. Given a function signature like: ``` struct Things { int a; Texture2D b; } Things myFunc(int x) { ... } ``` we want to split the result type into an "ordinary" result type and then `out` parameters for any opaque-type fields: ``` struct Things_Legal { int a; } Things_Legal myFunc(int x, out Texture2D result_b) { ... }; ``` Similarly, at a call site to a function like this: ``` Things t = myFunc(99); ``` we split the function result into ordinary and opaque-type parts, and pass the latter as `out` parameters: ``` Texture2D t_b; Things_Legal t = myFunc(99, /out/ t_b); ``` The main place where things get tricky is when dealing with `return` sites within the body of a function that needs legalization: ``` Things myFunc(int x) { ... Things things = ...; ... return things; } ``` In theory the answer is simple: a `return` translates into writes to the `out` parameters for any opaque-type data, followed by a return of the ordinary-type part: ``` Things_Legal myFunc(int x, out Texture2D result_b) { ... Things_Legal things = ...; Texture2D things_b = ...; ... result_b = things_b; return things; } ``` The sticking point here is that this step requires tracking data between the legalization of the parameter list for `myFunc` and legalization of the `return`s in its body, so that we can identify the `result_b` parameter to be able to write to it. The existing type legalization pass was not built with the idea that such communication is commonly needed; it assumes that each instruction can be legalized in isolation, so long as dependencies are respected. This change adds logic such that the `legalizeFunc()` step sets up a data structure that it used to represent information about how a function (and its parameter list) got legalized, so that the logic for a `return` can make use of that legalized information. Right now the information we track consists of just the list of parameters that were introduced to represent a return/result type. Testing ======= In order to confirm what features do/don't work, I added a set of tests that cover a cross-product of opaque type use cases: * The opaque type can be used in the function result type, an `out` parameter, or an `inout` parameter * The opaque type can be used "directly" or nested inside a `struct`. These tests are helpful to make sure we handle the most important cases, but it is worth noting that the coverage is still lacking in that we do not sufficiently test all the options for what the function body might do. An opaque-type function result could be derived from many different sources: * It could be a global shader parameter * It could be an `in` or `inout` parameter of the function itself * It could be wrapped up in one or more structure types * It could be wrapped up in one or more array types (such that the output of specialization needs to pass around array indices) * It could involve use of the type as a local variable (including passing it into other functions with result/`out`/`inout` outputs of opaque types) This change makes it so that we can handle the simplest cases involving result/return types with a wrapper `struct`, and adds test cases that confirm we handle several other cases for `out` and `inout` parameters. Gaining confidence that we cover all the cases that arise in practical shaders will require more work over following changes.	04 May 2021, 23:59:54 UTC
731f1fc	jsmall-nvidia	04 May 2021, 20:24:51 UTC	CUDA half comparison support (#1834) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out StringEscapeUtil. * Added StringEscapeUtil. * Fix typo in unix quoting type. * Small comment improvements. * Try to fix linux linking issue. * Fix typo. * Attempt to fix linux link issue. * Update VS proj even though nothing really changed. * Fix another typo issue. * Fix for windows issue. Fixed bug. * Make separate Utils for escaping. * Fix typo. * Split out into StringEscapeHandler. * Windows shell does handle removing quotes (so remove code to remove them). * Handle unescaping if not initiating using the shell. * Slight improvement around shell like decoding. * Simplify command extraction. * Add shared-library category type. * Fix bug in command extraction. * Typo in transcendental category. * Enable unit-test on in smoke test category. * Make parsing failing output as a failing test. * Fixes for transcendental tests. Disable tests that do not work. * Changed category parsing. * Removed the TestResult parameter from _gatherTestsForFile. Made testsList only output. * Remove testing if all tests were disabled. * Make args of CommandLine always unescaped. * Add category. * Don't need escaping on unix/linux. * Remove some no longer used functions. * Add requireSMVersion to CUDAExtensionTracker. * half-calc.slang now works for CUDA. * bit-cast-16-bit works on CUDA. * WIP handling of CUDA vector<half> types. * Half swizzle CUDA. * Half vector test. * Fix swizzle half bug. * Fix compilation issue with narrowing to Index. * Add unary ops. * Add some vector scalar maths ops. * Add half vector conversions for CUDA. * Fix erroneous comment. * Support for half comparisons. * First pass test for half compare. * Fix bug in CUDA specialized emit control. Updated tests to have pre and post inc/dec. * Removed unneeded parts of the cuda prelude. * Half structured buffer works on CUDA. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	04 May 2021, 20:24:51 UTC
dc571f1	Yong He	04 May 2021, 19:53:27 UTC	Update gfx getting started doc (#1832)	04 May 2021, 19:53:27 UTC
a342080	Nathan V. Morrical	04 May 2021, 19:52:58 UTC	Add support for RaytracingAccelerationStructure type for PTX targets (#1831) * enabling command line compiler to output PTX with multiple entry points. * adding some simple optix intrinsics to slang * Now handling the kIROp_RaytracingAccelerationStructureType for CUDA. Source seems to generate correctly, but accels are always optimized out of resulting PTX due to no trace calls * fixing unnecessary diff with master * allowing unhandled untyped buffers to fall through to super::calcTypeName Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	04 May 2021, 19:52:58 UTC
1c64316	jsmall-nvidia	04 May 2021, 18:44:20 UTC	More CUDA Half support (#1833) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out StringEscapeUtil. * Added StringEscapeUtil. * Fix typo in unix quoting type. * Small comment improvements. * Try to fix linux linking issue. * Fix typo. * Attempt to fix linux link issue. * Update VS proj even though nothing really changed. * Fix another typo issue. * Fix for windows issue. Fixed bug. * Make separate Utils for escaping. * Fix typo. * Split out into StringEscapeHandler. * Windows shell does handle removing quotes (so remove code to remove them). * Handle unescaping if not initiating using the shell. * Slight improvement around shell like decoding. * Simplify command extraction. * Add shared-library category type. * Fix bug in command extraction. * Typo in transcendental category. * Enable unit-test on in smoke test category. * Make parsing failing output as a failing test. * Fixes for transcendental tests. Disable tests that do not work. * Changed category parsing. * Removed the TestResult parameter from _gatherTestsForFile. Made testsList only output. * Remove testing if all tests were disabled. * Make args of CommandLine always unescaped. * Add category. * Don't need escaping on unix/linux. * Remove some no longer used functions. * Add requireSMVersion to CUDAExtensionTracker. * half-calc.slang now works for CUDA. * bit-cast-16-bit works on CUDA. * WIP handling of CUDA vector<half> types. * Half swizzle CUDA. * Half vector test. * Fix swizzle half bug. * Fix compilation issue with narrowing to Index. * Add unary ops. * Add some vector scalar maths ops. * Add half vector conversions for CUDA. * Fix erroneous comment.	04 May 2021, 18:44:20 UTC
7d52d3b	Tim Foley	04 May 2021, 17:36:57 UTC	Cleanup work on D3D12 shader object static specialization (#1830) * Cleanup work on D3D12 shader object static specialization This builds on PR #1829 in a small way (because that PR adds `getStride()` to Slang type layout reflection). The only relevant changes here are in the `render-d3d12.cpp` file. The basic idea here is to clean up the D3D12 path to be more in line with the cleanups made for D3D11 and Vulkan. The way that D3D12 shader parameter binding goes through a root signature means that some of the details that were required for those APIs (in particular, tracking both "primary" and "pending" offsets during multiple steps) are not required for the critical-path binding stuff on D3D12. There is some subtlety to the handling of the "ordinary" data buffer in the `bindAsValue()` case that I don't like, and that I'm not 100% confident I've gotten right. We may find that we have to revisit that logic as we add more tests. * fixup	04 May 2021, 17:36:57 UTC
1a4a513	jsmall-nvidia	30 April 2021, 20:51:25 UTC	Preliminary CUDA half maths (#1827) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out StringEscapeUtil. * Added StringEscapeUtil. * Fix typo in unix quoting type. * Small comment improvements. * Try to fix linux linking issue. * Fix typo. * Attempt to fix linux link issue. * Update VS proj even though nothing really changed. * Fix another typo issue. * Fix for windows issue. Fixed bug. * Make separate Utils for escaping. * Fix typo. * Split out into StringEscapeHandler. * Windows shell does handle removing quotes (so remove code to remove them). * Handle unescaping if not initiating using the shell. * Slight improvement around shell like decoding. * Simplify command extraction. * Add shared-library category type. * Fix bug in command extraction. * Typo in transcendental category. * Enable unit-test on in smoke test category. * Make parsing failing output as a failing test. * Fixes for transcendental tests. Disable tests that do not work. * Changed category parsing. * Removed the TestResult parameter from _gatherTestsForFile. Made testsList only output. * Remove testing if all tests were disabled. * Make args of CommandLine always unescaped. * Add category. * Don't need escaping on unix/linux. * Remove some no longer used functions. * Add requireSMVersion to CUDAExtensionTracker. * half-calc.slang now works for CUDA. * bit-cast-16-bit works on CUDA. * WIP handling of CUDA vector<half> types. * Half swizzle CUDA. * Half vector test. * Fix swizzle half bug. * Fix compilation issue with narrowing to Index. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	30 April 2021, 20:51:25 UTC
c45f368	Tim Foley	30 April 2021, 18:36:42 UTC	Clean up the way we stride over sub-object ranges for D3D11 and Vulkan (#1829) This change applies to the case where we have a sub-object range with more than one object in it (so arrays of constant buffers, parameter blocks, or existentials). It is worth noting that we don't really seem to have any tests covering this case right now, so it is entirely possible that things are busted already, and/or that this change doesn't work the way I think it does. When we go to actually bind the state from a sub-object range into the pipeline, we care about: * The offset to where the first object should be written * The "stride" between consecutive objects We were already capturing the offset information from Slang reflection data, and the same was true for the part of the offset that pertains to "pending" ordinary data. For other cases, though, we were manually incrementing the offset by values computed manually in `gfx` for Vulkan, and we were just skipping the offsetting step entirely for D3D11. With this change we extract more complete stride information from the Slang reflection data and use that instead of ad hoc computations. Hopefully this data is correct, and if it isn't we can consider whether it needs fixing at the Slang reflection level rather than in `gfx`. This change doesn't apply to the OpenGL back-end (which hasn't been updated to match the new approach to static specialization at all) or to the D3D12 back end (which has been updated a bit, but probably needs other cleanup work to be done first to bring it in line).	30 April 2021, 18:36:42 UTC
37a3417	Tim Foley	29 April 2021, 22:27:24 UTC	Update gfx back-ends to handle static specialization (#1826) * Update gfx back-ends to handle static specialization The main goal here is to make the D3D11, D3D12 and Vulkan back-ends support static specialization of interface types in the case where the data for the type won't "fit" in the pre-allocated space for existential values. This includes all cases where the concrete type being specialized to has resources/samplers/etc., as well as any cases where its ordinary/uniform data exceeds the space available. (Note that the CPU and CUDA targets don't need this work since they can (in theory) support arbitrary-size data in the fixed-size existential payload by using pointer indirection. Actually supporting indirection in those cases should be a distinct change) The Slang compiler already performs layout for programs that have this kind of data that doesn't "fit," and it lays them out using an idea of "pending" type layouts. Basically, a type that contains some amount of specialized interface-type fields will produce both a "primary" type layout that just covers the data for the unspecialized case, as well as "pending" type layout that describes the layout for all the extra data needed by specialization. When laying out a `ConstantBuffer<X>` or `ParameterBlocK<X>` ("CB" or "PB"), the front-end will try to place as much of that "pending" data into the layout of the buffer/block itself as is possible. That means that both CBs and PBs will be able to allocate trailing bytes for any ordinary data in the "pending" layout. PBs will be able to allocate any trailing resources/samplers into their layout, but for CBs they will spill out to be part of the pending layout for the buffer itself. In order for the back-ends to properly handle pending data, they need to either assume the exact layout rules used by the front-end and try to reproduce them (e.g., by iterating over binding ranges and sub-objects in the exact same order that front-end layout would enumerate them), or they need to respect the reflection information produced by the front-end. This change takes the latter approach, trying to make only minimal assumptions about the layout rules being used. This choice is motivated by wanting to decouple the `gfx` implementation from the compiler front-end, especially insofar as this work has made me question whether the current layout rules are the best ones possible. A common theme across all the implementations is to have a fixed-size type that can represent "binding offsets" for the chosen back-end. The offset type has fields that depend on the API-specific way bindings are indexed; e.g., for D3D11 it has offsets for CBV, SRV, UAV, and sampler bindings. This fixed-size offset type can be filled in based on Slang reflecton information, and then used to compute derived offsets with just a few add operations. The simple offset type for each API is then extended to produce an offset type that includes both the offsets for "primary" data and also the offsets for "pending" data. Most logic that traffics in offsets doesn't have to know about this more complicated representation. Making consistent use of these offsets required that I pretty much rewrite the logic that actually applies shader objects to the API state. Doing so might be lowering the efficiency of the system in the near term, but the increase in clarity was important for getting the work done, and it seems like it will also be important if/when we start trying to perform special-case optimizations around root and entry-point parameter setting. While there are many API-specific differences, we can identify a repeated pattern where many steps, whether applying parameters to the pipeline stage or constructing signatures / layouts, can be broken down into three main operations on `ShaderObject`s or their layouts: * `AsValue()` is the core operation, and is the one used for the `ExistentialValue` case most of the time. It ignores the ordinary data in the object, and instead processes all nested binding ranges (for resources/smaplers) and sub-objects. `AsConstantBuffer()` handles the `ConstntBuffer<X>` case, by dealing with the implicit buffer for ordinary data (if it is needed) and then delegates to the `AsValue()` case. * `AsParameterBlock()` handles the `ParameterBlock<X>` case, by allocating/preparing/etc. any descriptor tables/sets that would be required for the current object/layout and then delegating to `AsConstantBuffer()` to do the rest The idea is that by having the parameter block case delegate to the constant buffer case, which delegates to the value/existential case, we can streamline a lot of the logic so that it doesn't seem quite as full of special cases. Note: When preparing this pull request I spent a reasonable amount of time trying to clean up the D3D11 and Vulkan implementations, so they are probably the easiest to read and understand when it comes to the new code. Doing the cleanup work also helped to work out some weird corner case bugs/issues. In contrast, the D3D12 path hasn't had as much attention given to cleanliness and comments, so it really needs some attention down the line to get things into a state that is easier to understand. * fixup: remove debugging code spotted in review	29 April 2021, 22:27:24 UTC
aba8ec6	Yong He	29 April 2021, 21:44:09 UTC	Update nav.html	29 April 2021, 21:44:09 UTC
245352d	Yong He	29 April 2021, 21:34:09 UTC	Update document subtitle color (#1828)	29 April 2021, 21:34:09 UTC
23d0c89	Yong He	29 April 2021, 21:20:20 UTC	Add gfx user's guide. (#1824) * Add gfx user's guide. * Add getting started chapter in gfx-guide * Fixes * Fix * Polishing doc template	29 April 2021, 21:20:20 UTC
2482271	Yong He	29 April 2021, 21:19:51 UTC	`gfx` DebugCallback and debug layer. (#1822) * `gfx` DebugCallback and debug layer.	29 April 2021, 21:19:51 UTC
ad6f307	jsmall-nvidia	29 April 2021, 19:45:25 UTC	Simplify CommandLine by removing Escaping (#1825) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out StringEscapeUtil. * Added StringEscapeUtil. * Fix typo in unix quoting type. * Small comment improvements. * Try to fix linux linking issue. * Fix typo. * Attempt to fix linux link issue. * Update VS proj even though nothing really changed. * Fix another typo issue. * Fix for windows issue. Fixed bug. * Make separate Utils for escaping. * Fix typo. * Split out into StringEscapeHandler. * Windows shell does handle removing quotes (so remove code to remove them). * Handle unescaping if not initiating using the shell. * Slight improvement around shell like decoding. * Simplify command extraction. * Add shared-library category type. * Fix bug in command extraction. * Typo in transcendental category. * Enable unit-test on in smoke test category. * Make parsing failing output as a failing test. * Fixes for transcendental tests. Disable tests that do not work. * Changed category parsing. * Removed the TestResult parameter from _gatherTestsForFile. Made testsList only output. * Remove testing if all tests were disabled. * Make args of CommandLine always unescaped. * Add category. * Don't need escaping on unix/linux. * Remove some no longer used functions.	29 April 2021, 19:45:25 UTC
972bd3c	jsmall-nvidia	29 April 2021, 13:01:46 UTC	Support for escaped paths in tools (#1823) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out StringEscapeUtil. * Added StringEscapeUtil. * Fix typo in unix quoting type. * Small comment improvements. * Try to fix linux linking issue. * Fix typo. * Attempt to fix linux link issue. * Update VS proj even though nothing really changed. * Fix another typo issue. * Fix for windows issue. Fixed bug. * Make separate Utils for escaping. * Fix typo. * Split out into StringEscapeHandler. * Windows shell does handle removing quotes (so remove code to remove them). * Handle unescaping if not initiating using the shell. * Slight improvement around shell like decoding. * Simplify command extraction. * Add shared-library category type. * Fix bug in command extraction. * Typo in transcendental category. * Enable unit-test on in smoke test category. * Make parsing failing output as a failing test. * Fixes for transcendental tests. Disable tests that do not work. * Changed category parsing. * Removed the TestResult parameter from _gatherTestsForFile. Made testsList only output. * Remove testing if all tests were disabled. * Fix typo. * Disable path canonical test on linux because CI issue.	29 April 2021, 13:01:46 UTC
541d1ca	Nathan V. Morrical	27 April 2021, 17:36:00 UTC	adding some simple optix intrinsics to slang (#1817)	27 April 2021, 17:36:00 UTC
6928393	Nathan V. Morrical	27 April 2021, 07:57:43 UTC	enabling command line compiler to output PTX with multiple entry points. (#1816) Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>	27 April 2021, 07:57:43 UTC

Newer
Older