Revision history - None - origin: https://github.com/shader-slang/slang

Revision	Author	Date	Message	Commit Date
2765861	Tim Foley	08 March 2021, 21:05:56 UTC	Add GLSL support for SV_InnerCoverage (#1740) This was a fairly straightforward addition once I found the correct GLSL extension spec to use.	08 March 2021, 21:05:56 UTC
fc9968d	Yong He	08 March 2021, 18:01:20 UTC	Refactor window library. (#1739) * Refactor window library. * Fix project file * Fix warnings.	08 March 2021, 18:01:20 UTC
95ca939	Yong He	08 March 2021, 03:31:08 UTC	Bug fix in window creation. (#1738)	08 March 2021, 03:31:08 UTC
e962f1a	Tim Foley	05 March 2021, 23:02:44 UTC	Add Vulkan/SPIR-V support for TraceRayInline() (#1737) For the most part, this translation is straightforward because the `GL_EXT_ray_query` extension is well aligned with the DXR 1.1 `RayQuery` feature. Many functions map one-to-one from one extension to the other. A few notable details: * The equivalent of the `RayQuery<Flags>` type is non-generic in GLSL, and the GLSL path previously didn't have support for trying to look up an intrinsic type name on an IR type declaration, so that required some tweaks to the emit logic. * All the GLSL functions are free functions instead of member functions, but our IR doesn't recognize that distinction anyway. * The main `TraceRayInline()` call is the one that took the most tweaking, just because it takes a `RayDesc` structure for D3D/HLSL but takes individual vectors and scalars for VK/GLSL. The approach here is a standard one for how we manage this stuff in the stdlib (and I wanted to avoid adding even more `$` magic for intrinsics). * For several other calls, the HLSL API had distinct `Candidate*()` and `Committed()` calls that return information about a candidate hit vs. the one committed into the query. In contrast, the GLSL API uses a single call that takes an additional "must be compile-time constant" `bool` parameter to select between the two behaviors. This is even the case for one call that basically returns a value of a different `enum` type depending on the state of that `bool`. The D3D API model here seems almost strictly better and I have no idea why the GLSL extension was defined this way. Because both the `GL_EXT_ray_query` and `GL_EXT_ray_tracing` extensions declare the `accelerationStructureEXT` type, we can no longer infer what extension is supposed to be used based only on the presence of such a type.
The logic right now is a bit slippery, because in theory a program that declares an acceleration structure but never traces into it could end up getting a compilation error now. We will have to see if that corner case comes up in practice. :( The one big detail that is looming after doing this work is that both the HLSL and GLSL exposures of ray queries are extremely "slippery" about the actual identity of queries (e.g., when is one query a copy of another, vs. just being a new variable that references the existing query). Somehow queries get their identity from the original declaration, and as such our "default constructor" approach to them seems semantically correct, but the whole thing is kind of slippery at a foundational level and I don't know how to fix it with the API as defined. Oh well; just something to keep an eye on. Co-authored-by: Yong He <yonghe@outlook.com>	05 March 2021, 23:02:44 UTC
860d17b	jsmall-nvidia	05 March 2021, 19:34:46 UTC	Doc tooling improvements (#1734) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out AST 'printing'. * Replace listener with List<Section> * Section -> Part. * Kind -> Type Flags -> Kind for ASTPrinter::Part * Improve comments around ASTPrinter. * toString -> toText on Val derived types. toText appends to a StringBuilder. * Added toSlice free function. Added operator<< for Val derived types. Use << where appropriate in doing toText. * More work at mark down output. * Fill in sourceloc for enum case. Add more sophisticated location determination for EnumCase. Refactored documentation output into DocMarkdownWriter. * Improvements for sig output. * Split up slang-doc into extractor and writer. * WIP generic support for doc support. * Some refactoring to make DocExtractor have potential to be used without Decls. * Made doc extraction work without Decls. * Output generic parameters. * Add generic parameter extraction. * Added writing variables. * Add an interface test. * Fix toArray. * Support for extensions, and inheritance. * Disable the doc test. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	05 March 2021, 19:34:46 UTC
dc71108	Yong He	05 March 2021, 18:58:08 UTC	Cache stdlib when creating global session. (#1736) * Cache stdlib when creating global session. * Fix * Fix	05 March 2021, 18:58:08 UTC
a5ac499	Yong He	05 March 2021, 00:25:58 UTC	Refactor `gfx` to surface `CommandBuffer` interface. (#1735) * Refactor `gfx` to surface `CommandBuffer` interface. * Fixes. * Fix code review issues, and make vulkan runnable on devices without VK_EXT_extended_dynamic_states. * Update solution files * Move out-of-date examples to examples/experimental Co-authored-by: Yong He <yhe@nvidia.com>	05 March 2021, 00:25:58 UTC
13ff0bd	Tim Foley	03 March 2021, 19:45:39 UTC	Add GLSL/SPIR-V support for GetAttributeAtVertex (#1733) This change allows varying fragment shader inputs to be declared in a way that allows the `GetAttributeAtVertex` operation to compile to valid code for both D3D and GLSL/SPIR-V/Vulkan. The key is that rather than just use ordinary `nointerpolation`-qualified inputs the code must declare these varying inputs with a new `pervertex` qualifier that marks them as only being usable with `GetAttributeAtVertex`. The `pervertex`-tagged inputs then translate to GLSL inputs using the `pervertexNV` qualifier. Note that this change does not include any enforcement of the requirements around how these qualifiers are used (and the compiler doesn't have enforcement for the existing operations like `EvaluateAttributeAtCentroid`). The underlying problem is that the interpolation-mode qualifiers and explicit interpolation functions in HLSL constitute a kind of rate-qualified type system, but without any systematic rules. It seems wasteful to encode a bunch of ad hoc rules for this stuff as special cases in the compiler when the clear right answer is to implement a systematic approach to rates.	03 March 2021, 19:45:39 UTC
d6ae671	Tim Foley	02 March 2021, 23:46:28 UTC	Clean up declarator handling during source emit (#1732) This change tidies up some code related to the handling of declarators for the purpose of "unparsing" types into C-like declarations. The big change is that the `EDeclarator` type is changed to `DeclaratorInfo` and now has a bit of a subtype hierarchy under it rather than just using a `union`. The declarations have been moved to the header for `CLikeSourceEmitter` so that they can be used by subclasses. I also removed the `IRDeclaratorInfo` type that was being declared but never actually used, and moved the case for pointers from that type into the main `EDeclarator`/`DeclaratorInfo`.	02 March 2021, 23:46:28 UTC
c2653ba	jsmall-nvidia	02 March 2021, 22:03:16 UTC	Fix issue with long identifier names in GLSL output (#1731) * #include an absolute path didn't work - because paths were taken to always be relative. * First pass at handling 'names' that are too long in GLSL output. * Test to check functionality with very long func name. * Add access a long names buffer. * Fix typo in assert. Fix issue with coercion error for 1.0f / 0x7fffffff Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	02 March 2021, 22:03:16 UTC
b81e8d4	Tim Foley	02 March 2021, 20:52:34 UTC	Add command-line control over SPIR-V version (#1730) * Add command-line control over SPIR-V version By default the Slang compiler policy is usually to produce output with the fewest dependencies possible. If input code can be encoded as SPIR-V 1.0, that is what we will use by default. The catch here is that in some cases later SPIR-V versions introduced improvements to the encoding that can affect performance (e.g., around large global arrays of constants), so that a user might explicitly want to require a newer SPIR-V version (restricting the driver versions their code can work on) in the hopes of seeing better performance. This change uses the system of capabilities that was previously introduced so that an option like `-profile glsl_450+spirv_1_5` can be used to explicitly request a specific SPIR-V version. Consistent with the existing implementation, the requested version will be taken as a minimum, and the final version might be higher based on other requirements (e.g., use of intrinsic functions that require a higher version). The test case included here is a little iffy in terms of long-term maintenance. It relies on having both a `.slang` file and a `.glsl` file that we compile with the same options and then compare the SPIR-V, but that means there is no direct testing that the output SPIR-V actually uses the necessary version. If we break the inference of SPIR-V versions for both the regular and pass-through paths at once, this test won't flag the problem. A better test is probably needed soon. This change only adds support for controlling the SPIR-V version via capabilities specified via the command line or API. It would be nice for a future change to allow something like `[require(spirv_1_5)]` to be added to an entry point function to allow the user to embed their expectation/requirement into the source code. * fixup: clang warning	02 March 2021, 20:52:34 UTC
837a155	jsmall-nvidia	01 March 2021, 20:37:46 UTC	Doc improvements (#1729) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out AST 'printing'. * Replace listener with List<Section> * Section -> Part. * Kind -> Type Flags -> Kind for ASTPrinter::Part * Improve comments around ASTPrinter. * toString -> toText on Val derived types. toText appends to a StringBuilder. * Added toSlice free function. Added operator<< for Val derived types. Use << where appropriate in doing toText. * More work at mark down output. * Fill in sourceloc for enum case. Add more sophisticated location determination for EnumCase. Refactored documentation output into DocMarkdownWriter. * Improvements for sig output.	01 March 2021, 20:37:46 UTC
b3501ad	Tim Foley	26 February 2021, 17:43:03 UTC	Shader object specialization work-in-progress (#1728) * Shader object specialization work-in-progress The big change here is in the `setObject()` implementations, where we now write the witness table ID and data for the value being assigned in both the CUDA and graphics-API paths (it is possible the code could be shared...). The logic for deciding whether a value "fits" in the existential value payload should actually be correct here, since it uses the reflection data. The other relevant change is that the logic for writing out the ordinary/uniform data for a shader object on the graphics-API path has been updated so that it only allocates the GPU buffer after it knows the specialized layout, and can thus allocate space for any extra parameter data that wasn't in the original layout but got added by specialization. There is some inactive code in place that tries to sketch how the implementation should handle writing the data of sub-objects for interface-type fields into the appropriate areas of the allocated buffer for a parent object, but that is stubbed out for now pending implementation of the relevant reflection information. This change also introduces logic in the graphics-API path to create a specialized layout for a shader object on-demand (so that it will only be created after the specialization arguments are known or can be inferred). The implementation needs to treat ordinary shader objects and root shader objects differently because the Slang API handles specialization differently for ordinary types vs. `IComponentType`s. Some notes and caveats: * The CUDA path doesn't need to compute specialized layouts the way the graphics-API path does because layout doesn't change based on specialization for that path (just as it won't for the CPU path) * This code just skips over the RTTI field in existential values because it seems that we currently aren't using it in generated code.
* We are completely missing the logic for recursively writing the resource ranges of sub-objects bound to interface-type fields into the descriptor set(s) of the parent object. The missing link there is reflection API support, just as it is for filling in the ordinary/uniform data. We need a way to get the binding range offset (and binding array stride) for the "pending" data of a specialized interface-type field. * The logic for computing specialization arguments based on the shader objects bound to interface-type fields has a lot of holes. Some of the indexing math is flat-out incorrect, and it also doesn't make any attempt to handle sub-object ranges with more than one element in them. I tweaked some of the code there to make it more correct, but that doesn't mean it is actually correct at this point. * The logic for computing a specialized `IComponentType` for a `ProgramVars` in the graphics-API path seems to have a lot of overlap with `maybeSpecializeProgram()`, so we should look into ways to avoid the duplication over time. * clang error fix	26 February 2021, 17:43:03 UTC
af63ee4	Tim Foley	25 February 2021, 03:22:31 UTC	Partial fix for macro expansion of token pastes (#1727) The underlying problem here requires that we have an object-like macro with an expansion that starts with a non-identifier token: ``` ``` Then we need a function-like macro that uses a token paste in a way that can expand to that object-like macro: ``` ``` Finally, for the specific case a user ran into, we need to invoke that function-like macro in the context of a preprocessor conditional expression: ``` // ... #error "unimplemented" ``` The way the problem manifests is that the preprocessor logic that handles conditional expressions tries to "peek" one token ahead and see what is coming, and while the peeking logic handles macro expansion it does not handle token pasting right now. That means that the peek operation sees `MY_FEATURE` and assumes that it is seeing an identifier in a preprocessor conditional that doesn't have a macro expansion. The logic then goes on to read the token, but what it gets back is not an identifier, and is instead the numeric literal token `1`, because the reading logic handles token pasting. The quick fix I applied here is to make the logic that deals with preprocessor conditionals go ahead and automatically consume a token from the input, and then decide what to do based on that token, so that it always makes use of the reading logic that handles token pasting. The lingering problem is that we still have cases in the preprocessor that use the peeking logic which doesn't handle pasting, and we might find that those cases have reason to want the same kind of expansion behavior we needed here. A more systematic fix would be to have the peeking logic automatically handle token pasting as well as macro expansion, but doing so would be a more complicated change because detecting the `##` when peeking ahead requires two tokens of lookahead, and our current implementation only assumes we can support one.
Co-authored-by: Yong He <yonghe@outlook.com>	25 February 2021, 03:22:31 UTC
9b7a007	Yong He	24 February 2021, 23:43:43 UTC	Explicit swapchain interface in `gfx`. (#1726) * Explicit swapchain interface in `gfx`. * Correctly return nullptr when `IRenderer` creation failed. * Fix crashes on CUDA tests. * Cleanups.	24 February 2021, 23:43:43 UTC
d66b307	Tim Foley	24 February 2021, 16:21:37 UTC	Add support for GetAttributeAtVertex for D3D (#1725) This operation was added along with the `SV_Barycentrics` system-value input, and allows for a `nointerpolation` varying input to a fragment shader to be fetched at a specific vertex index within the primitive that is causing the fragment shader to be invoked. This change adds support for the new operations in the standard library, and also includes a test case to make sure that we emit it correctly when producing HLSL/DXIL. This change also includes a small bug fix to our emission logic for function parameters so that we properly emit layout-related attributes for varying parameters declared directly on an entry point. (Note that most attributes end up being declared in `struct` types in existing HLSL shaders, and our IR passes produce only global variables for attributes on GLSL; the only case this affects is individual scalar/vector attributes declared as entry-point parameters, when outputting HLSL) Note that this change only adds support for the new function on the HLSL/DXIL path, and doesn't yet add any cross-compilation support for GLSL/SPIR-V. The reason for this is that the equivalent GLSL feature(s) appear to use a different model to the HLSL version, and we need to invent a suitable approach to align them to make portable code possible.	24 February 2021, 16:21:37 UTC
55a5ccc	jsmall-nvidia	23 February 2021, 17:36:46 UTC	Documentation markup extraction (#1724) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP extracting source documentation. * WIP doc extraction. * More stuff around doc markup extraction. * More WIP around doc extraction. * Fix some indexing issues. * Initial doc extraction working. * Renaming of types in markup extraction process. * Extracting markup content. Removing indenting. Other fixes and improvements around document tools. * WIP support for documentation system. * Remove some commented out sections. * Remove some comments that no longer apply. * Improvements around SourceFile - such that more granularity around line ops. Made some functionality explicitly work without source. Improved Doc types naming.	23 February 2021, 17:36:46 UTC
4bf01b0	Tim Foley	23 February 2021, 09:47:19 UTC	Some ad hoc parser fixes (#1723) The `AdvanceIfMatch()` method was introduced to the parser as a way to avoid infinite loops when parsing nested list structures (e.g., `()`-enclosed parameter lists). The basic idea is that it tries to detect if we have scanned "too far" looking for a closing token, and reports a match to whatever logic was doing the looping to break the stalemate. Unfortunately, the `TryRecoverBefore` logic was changed at some point so that it doesn't necessarily advance any tokens at all, because we generally don't want to skip over a `}` while searching for a `)`. As a result, we could still end up in an infinite loop where we didn't consume any additional tokens as part of recovery, but wouldn't bail out of the search for a match. This change tries to introduce a slightly more systematic setup where `AdvanceIfMatch` is now parameterized on a type of matched token pair (not just the closing token), and each such matched token pair introduces a list of tokens where if we see them as our lookahead we should bail out (e.g., when looking for a `)` we should give up the search upon seeing a `}`). After installing that fix I found that my simple test case still gave a surprising error because when mistakenly parsing a function body the parser would look for a `{` and then a `}` to close the body. The search for a closing `}` could accidentally consume a `}` meant for an outer scope, and lead to a cascading failure. I made a quick fix to the parsing of block statements so that we don't look for a closing `}` if we never had an opening `{`, but that isn't really a systematic solution like we truly need. For now, these fixes will avoid the infinite-loop case, and should give a better diagnostic in the case a user ran into, but we need to take time to do some more top-down work on the parser sooner or later.	23 February 2021, 09:47:19 UTC
025c0ed	Tim Foley	22 February 2021, 23:07:26 UTC	Add basic support for fragment shader interlock (FSI) (#1722) Both D3D "rasterizer ordered views" (ROVs) and GLSL "fragment shader interlock" (FSI) are aimed at the same basic use case: they allow for fragment shaders to contain operations that require mutual exclusion and/or deterministic ordering between fragment shader invocations that affect the same framebuffer coordinates. The language-level exposure of the features varies greatly between the two API families, though: * ROVs define an implicit ordering and mutual exclusion constraint: certain resource parameters are marked as `RasterizerOrdered`, and reads/writes to these resources must be sequenced as if fragment-shader invocations ran in sequential order for each pixel. * FSI defines paired begin/end functions that mark a critical section of code. All memory operations in the critical section must be sequenced as if fragment-shader invocations ran in sequential order for each pixel. In order to make this model tractable, only a single critical section is allowed per fragment shader, and the begin/end must appear at the top level of the shader entry point function (not under control flow or after a possible conditional `return`). The simplest way for Slang to support portable programs that run across both API families is to insist that code that cares about these ordering guarantees must use both mechanisms, and then each of them will only affect the API that cares about it. Slang already supports ROV resource types, and already lowers them to plain textures for GLSL/SPIR-V. This change adds the missing feature of a begin/end function pair for FSI, which will map to empty functions on non-GLSL targets.	22 February 2021, 23:07:26 UTC
e1e4220	Tim Foley	19 February 2021, 18:32:19 UTC	Add a chapter on target platforms (#1720) * Add a chapter on target platforms The primary goals of this chapter are: * Make users aware of just how many different ways of handling things there are across targets. If a user leaves this chapter thinking "how in the world can you abstract over all these differences?", then we have done our job, because they are primed to understand why layout and parameter binding are necessarily complicated. * Help users to understand/recall the relevant capabilities and restrictions of the platforms they care about most. If somebody only cares about D3D12 and Vulkan, I want them to leave with a detailed understanding of how those two differ so they can understand the specifics of where the layout and parameter-binding algorithms have to treat those targets differently. All of this could conceptually be just a background section in the layout and parameter-binding chapter, but putting it off in its own chapter avoids that one taking forever to actually get where it is going. * Typos	19 February 2021, 18:32:19 UTC
5f7dc28	Yong He	19 February 2021, 18:11:01 UTC	Make gfx library visible to external user. (#1719) * Make gfx library visible to external user. * Fixup	19 February 2021, 18:11:01 UTC
22fe1df	Yong He	18 February 2021, 02:46:14 UTC	Fix typo in user guide.	18 February 2021, 02:46:14 UTC
b1e376f	Tim Foley	18 February 2021, 02:42:23 UTC	Streamline shader object creation (#1717) This change kind of rolls together two different simplifications: 1. The `createShaderObject()` shouldn't really need to take an `IShaderObjectLayout` because it could just take the `slang::TypeLayoutReflection` instead and create the shader-object layout behind the scenes. 2. For that matter, it needn't take a `slang::TypeLayoutReflection` either, because it could just take a `slang::TypeReflection` and query the layout of that type behind the scenes. The combination of these two changes means: * `IShaderObjectLayout` is gone from the public API, as is `createShaderObjectLayout()` * `createShaderObject()` directly takes a `slang::TypeReflection` and allocates a shader object of that type The result is simpler and more streamlined application code. Note that under the hood the implementation still has shader-object layouts, using the `ShaderObjectLayoutBase` class. A few locations had to change to use `RefPtr`s instead of `ComPtr`s now that the class is no longer a public COM-lite API type. The hope is that this change makes it easier to allocate/cache layouts for things like specialized types "under the hood," as is needed to implement parameter setting for static specialization.	18 February 2021, 02:42:23 UTC
bdb0c0b	Yong He	18 February 2021, 01:41:57 UTC	Further documentation on Slang specific features (#1716) Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	18 February 2021, 01:41:57 UTC
62a0193	Tim Foley	18 February 2021, 00:53:17 UTC	Use CPU memory for shader object ordinary data (#1714) This change makes it so that the shared shader object implementation across graphics APIs (everything except CUDA and CPU) uses a host-memory buffer to store ordinary (aka "uniform") data while the shader object is being set up / modified, and then allocates and initializes a GPU-memory buffer for the data on-demand once setup is complete. This choice is a necessary step for supporting interface/existential-type fields in the presence of static specialization, because any fixed-size GPU buffer we would try to allocate at the time an object is first created might not turn out to be large enough if static specialization must handle a concrete type that doesn't "fit" into the fixed-size space reserved for an existential value (resulting in the value having to be placed in an overflow region outside the original object). This change does not include any of the work related to actually laying out existential-type fields in this fashion. It instead just focuses on changing when and where the GPU memory allocation is performed to one that is more appropriate for those subsequent changes.	18 February 2021, 00:53:17 UTC
360d4f7	jsmall-nvidia	18 February 2021, 00:04:48 UTC	More #line improvements (#1713) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP: First pass in supporting output of line error information. * Add support for lexing to better be able to indicate SourceLocation information. * Fix lexer usage in DiagnosticSink in C++ extractor. * Update diagnostics tests to have line location info. * Fixed test expected output that now have source location information in them. * Better handling of tab. * Fix test expected results for tabbing change. * DiagnosticLexer -> DiagnosticSink::SourceLocationLexer Added line continuation tests. * Fix typo. * Added String::appendRepeatedChar * Change to rerun tests. * Added source locations to IR dumping. * Output column for IR dump source loc. * Add support for closing brace location to AST. Use closing brace location in lowering when adding return void. * Set the source location through SourceLoc - simplifies identifying if current loc is valid. * Copy terminator sloc. * Test for improved #line handling. * Made writer the last parameter for dumpIR. Small improvements to comments. * Disable sloc output on dump IR by default. * Fix issue with #line and inlining. * Fix for output with improved #line output. * Small comment change - mainly to kick off TC build. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	18 February 2021, 00:04:48 UTC
e59aee1	Yong He	17 February 2021, 23:09:09 UTC	Add `SampleGrad` overload for lod clamp. (#1711) * Add `SampleGrad` overload for lod clamp. * Fix gfx to run the test on vulkan. * Whitespace change to trigger CI build * remove presentFrame call in render-test Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	17 February 2021, 23:09:09 UTC
39975b2	Tim Foley	16 February 2021, 22:03:39 UTC	Fixes to get shader-object example working on CUDA (#1708) The purpose of these changes is to make the `shader-object` example work correctly on CUDA. Originally I had tried to add changes to the "flat" reflection information so that it introduced descriptor ranges to match the binding ranges it added for interface/existential-type fields. This approach helped the CUDA code that was using that information to try and compute uniform offsets for those fields, but it broke most of the other renderer back-ends. Instead, I removed the relevant asserts from `CUDAShaderObject::setObject()`. Note that there are leftover changes from my edits to the flat reflection information, around how it handles "leaf" fields that consume multiple resource kinds. I believe that those changes are, on balance, "more correct" now than they were before, so I decided to leave them in. The other major fix here is to specialize the `CUDAShaderObject::setObject()` logic to handle the case of setting a shader object for a parameter that has interface type instead of a constant-buffer or parameter block. Mostly I just copy bytes from the child object into the parent object. There are a few caveats, though: * I am not writing the RTTI or witness-table information, so dynamic dispatch won't work. * I am assuming a hard-coded offset of 16 bytes for the any-value, which will work for now but is a bit too "magical" and might also break once we support conjunctions of interfaces with dynamic dispatch * I am assuming that the child value to be written into the field will "fit" into the any-value area. We need some way to determine whether or not things fit dynamically (ideally using the reflection data), and adapt accordingly.
* I had to add another method on the base CUDA shader object type to handle setting data using a device-memory pointer instead of a host-memory pointer * There's not a lot we can do about it, but in the case of assigning an ordinary `CUDAShaderObject` into an interface-type field of a `CUDAEntryPointShaderObject` we end up needing to perform a device->host memory copy, because the bytes of the value will have already been written to GPU memory, but need to be in host memory for the dispatch call. * The implementation I'm using here basically assumes that the child shader object must have been finalized before it gets plugged into the parent shader object. We haven't yet made a policy decision about that bit.	16 February 2021, 22:03:39 UTC
e474c4e	Tim Foley	16 February 2021, 19:48:21 UTC	Add an accessor for IRInst opcode (#1707) * Add an accessor for IRInst opcode The main change is renaming `IRInst::op` to `IRInst::m_op` and adding an accessor `IRInst::getOp()` to read it. The rest of the changes are just changing use sites to `getOp` (or to `m_op` in the limited cases where we write to it). This work is in anticipation of a future change that might need to store an extra bit in the same field as the opcode. It seemed better to do this massive refactoring as a separate PR. * fixup	16 February 2021, 19:48:21 UTC
5777545	Yong He	12 February 2021, 23:01:45 UTC	Add associated type and generic value parameter doc section (#1706) * Add associated type and generic value parameter doc section * Typos and corrections.	12 February 2021, 23:01:45 UTC
e2096cf	Tim Foley	12 February 2021, 21:48:11 UTC	Initial support for DXR payload access qualifiers (#1705) This change adds initial support for a feature being proposed for inclusion in dxc: https://github.com/microsoft/DirectXShaderCompiler/pull/3171. The main features are: * A `[payload]` attribute that indicates which `struct` types are intended to be used as payloads. Consistent use of this attribute should mean that an application no longer needs to manually specify a maximum payload size when creating a ray-tracing pipeline. * `read(...)` and `write(...)` qualifiers which can be attached to fields of `struct` types (usually `[payload]`-attributed types) to indicate which ray tracing pipeline stages are allowed read/write access to that part of the payload. Use of these qualifiers should allow an implementation to optimize storage of ray payload elements across RT pipeline stages. The work in this change just adds basic parsing for these features, translation to matching IR decorations, and then emission of HLSL text based on those decorations. Notable gaps in this first change include: * No work is currently being done to validate access to ray payloads in RT entry points based on these qualifiers. * The stage names in `read(...)` and `write(...)` are not being validated, and are being stored in the IR as text. These should probably use the `Stage` enumeration in some fashion, but we would need to have a way to encode the additional `caller` pseudo-stage that the feature uses. * No work is currently being done to adjust or react to the chosen shader model when emitting HLSL code. We should either have these attributes force a switch to a higher shader model, or skip emission of these attributes if the chosen shader model / profile does not imply support for them. * No tests are currently included for this work, because tests would rely on using a custom `dxcompiler.dll` build with the new feature supported.	12 February 2021, 21:48:11 UTC
0dea127	Yong He	12 February 2021, 20:54:48 UTC	First part of interfaces and generics doc. (#1704) Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	12 February 2021, 20:54:48 UTC
0befc84	Tim Foley	12 February 2021, 20:53:56 UTC	Further documentation work (#1703) * Move around the conventional/convenience features chapters * Add a first draft of a section on compilation using `slangc` and the COM-lite API Co-authored-by: Yong He <yonghe@outlook.com>	12 February 2021, 20:53:56 UTC
a2401a6	Yong He	12 February 2021, 20:20:17 UTC	Support `bit_cast` between complex types. (#1702) * Support `bit_cast` between complex types. * Fix vs project file * Fix clang build error * fix * fix * Fix * FIx * Fix * Fix * Fix * Fix * Fix linux compile error Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	12 February 2021, 20:20:17 UTC
369279e	jsmall-nvidia	12 February 2021, 19:31:56 UTC	Diagnostic location highlighting (#1700) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP: First pass in supporting output of line error information. * Add support for lexing to better be able to indicate SourceLocation information. * Fix lexer usage in DiagnosticSink in C++ extractor. * Update diagnostics tests to have line location info. * Fixed test expected output that now have source location information in them. * Better handling of tab. * Fix test expected results for tabbing change. * DiagnosticLexer -> DiagnosticSink::SourceLocationLexer Added line continuation tests. * Fix typo. * Added String::appendRepeatedChar * Change to rerun tests. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	12 February 2021, 19:31:56 UTC
cd79bfb	Yong He	11 February 2021, 18:09:20 UTC	Add convenience features chapter in user-guide doc (#1699) * Fix getting started doc * Add convenience features chapter in user-guide doc Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	11 February 2021, 18:09:20 UTC
e1b1ce3	Tim Foley	11 February 2021, 00:35:52 UTC	Fix a bug in IR lowering (#1701) The underlying problem here is that our `SharedIRBuilder` (which currently owns the "global" value-numbering map) has a subtle invariant ("subtle" in the sense of "dangerous and bad"). The value-numbering map stores `IRInst`s for things like constants and types, and if those instructions end up getting modified or deleted (deleting an instruction currently runs its destructor but does not free the pool-allocated memory), then it is possible for the computed hash code for an instruction to no longer match what it was when it was inserted. The trigger in this case was a use of the `IRInst::removeAndDeallocate()` operation inside of the AST-to-IR lowering pass, which uses a single `SharedIRBuilder`. If that `removeAndDeallocate()` happens to apply to a value in the value-numbering map, then it risks breaking the next time the map gets rehashed. The short-term fix here is simple: never try to delete an instruction during IR lowering, even if it is known to be unused. Instead, we can rely on the subsequent DCE pass to eliminate the instruction. A longer-term fix here would involve fixing our entire strategy around value numbering. We know we need to do that, but that would be a big enough change that it couldn't be pursued as part of a simple bug fix like this.	11 February 2021, 00:35:52 UTC
8750a7c	Tim Foley	10 February 2021, 02:16:28 UTC	Add more to User's Guide (#1698) This change adds a first draft of an Introduction chapter, along with a chapter about the "conventional" features of Slang (when compared to HLSL, GLSL, and C/C++).	10 February 2021, 02:16:28 UTC
03f6389	Yong He	09 February 2021, 16:40:27 UTC	Add getting started documentation (#1697) * Add getting started documentation * wording * wording	09 February 2021, 16:40:27 UTC
53ff724	jsmall-nvidia	08 February 2021, 22:53:02 UTC	Hotfix/doc typo lexical (#1696) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix typo	08 February 2021, 22:53:02 UTC
10a55d8	jsmall-nvidia	08 February 2021, 22:49:45 UTC	DX12 & NVAPI fixes (#1695) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix bugs with m_features on Dx12 and gl. Fix issue about GFX_NVAPI availability. * Fix handling of SLANG_E_NOT_AVAILABLE on renderer startup. * Clarify comment. * Improve comment.	08 February 2021, 22:49:45 UTC
891791e	jsmall-nvidia	08 February 2021, 21:29:31 UTC	Copy SourceLoc when inlining (#1692) * #include an absolute path didn't work - because paths were taken to always be relative. * Copy source loc information when inlining.	08 February 2021, 21:29:31 UTC
df7548e	Yong He	05 February 2021, 22:36:07 UTC	Shader-Object example (#1694)	05 February 2021, 22:36:07 UTC
5fbaccf	jsmall-nvidia	05 February 2021, 19:59:46 UTC	Typo in renderer name for DX12 (#1693) * #include an absolute path didn't work - because paths were taken to always be relative. * Typo for renderer name for DX12.	05 February 2021, 19:59:46 UTC
adb1131	Tim Foley	05 February 2021, 17:01:36 UTC	Initial implementation of interface conjunctions (#1691) The basic feature here is the ability to use the `&` operator to produce the conjunction/intersection of two interfaces. That is, you can have interfaces: interface IFirst { int getFirst(); } interface ISecond { int getSecond(); } and if you need a generic function where the type parameter `T` must conform to both of these interfaces, you express that by constraining the parameter to the intersection of the interfaces: void someFunction<T : IFirst & ISecond>(T value) { ... } Without this feature, the main alternative an application would have is to define an intermediate interface, like: interface IBoth : IFirst, ISecond {} Forcing users to deal with an intermediate interface creates more work for type authors (they need to remember to inherit from the right combined interface(s)), or for `extension` authors (when you add `ISecond` to a type that used to just support `IFirst`, you had better also add `IBoth`). In the worst case, a family of N related "leaf" interfaces would give rise to an exponential number of intermediate interfaces to represent the possible combinations. A conjunction like `IFirst & ISecond` is officially its own type, and can be used to declare a type alias: typealias IBoth = IFirst & ISecond; This change only includes the first pass of work on this feature, so there are several caveats to be aware of: * Using a conjunction as part of an inheritance clause is not yet supported (e.g., `struct X : IFirst & ISecond`). This is true even if the conjunction was introduced by an intermediate `typealias` * The `&` syntax introduced here is only parsed in places where only a type (not an expression) is possible. This means you cannot do things like cast to a conjunction with `(IFirst & ISecond)(someValue)`. 
* This work should apply to conjunctions of more than two interfaces (like `IA & IB & IC`) but that has not yet been tested * In the long run it may be sensible to allow conjunctions that use concrete types, but we really ought to have the semantic checking logic rule that out for now. * During testing, I encountered compiler crashes when trying to use this feature together with `property` declarations. Further investigation and debugging is called for. * The handling of conjunction types is currently incomplete, in that there are many equivalences the compiler does not yet understand. For example, it is clear that `IA & IB` is equivalent to `IB & IA`, but the compiler currently does not understand this and will treat them as different types. A deeper implementation approach is called for. * Conjunctions are currently only supported for generic type parameter constraints, when performing full specialization. Use of conjunctions for existential-type value parameters or with dynamic dispatch is not yet supported.	05 February 2021, 17:01:36 UTC
fb05343	jsmall-nvidia	04 February 2021, 23:45:50 UTC	Fix line offset problem (#1690) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP diagnostics for line number output. * Small param naming change * Use x macro for pass through compile human name lookup/getting. * WIP on parsing downstream compiler output. * Split out parsing into ParseDiagnosticUtil. Added test result of single line. * Dump out the std output on fail to parse diagnostics. * Change test type for syntax-error-intrinsic.slang be TEST not TEST_DIAGNOSTIC * Use Index for StringUtil. * WIP: First pass support for parsing Slang diagnostics. * WIP Testing comparing with ParseDiagnosticUtil with previous ad-hoc mechanism. * Use the new parsing mechanism for diagnostic comparisons. * Fix layout on GLSL, doesn't have CR so runs into main. * Split out switch on outputting intrinsic 'specials'. Output code around intrinsic as emit - so that we get the appropriate indenting (and potentially other benefits). * Improvements to diagnostics parsing. Better error handling, and fallback handling. Added ability to parse downstream compilers without a prefix. Added ability to parse Slang with a prefix. * DownstreamDiagnostic::Type -> Severity and related fixes. * Small fixes around moving from DownstreamDiagnostic::Type -> Severity * Fix handling of 'special intrinsic' expansion * Split out the handling of intrinsic expansion into its own type and files. * Fixes to reading expected output - for SimpleLine test. * Test using += to check #line output. * A test around += and return. * Small comment fixes. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	04 February 2021, 23:45:50 UTC
c40f10b	Yong He	04 February 2021, 21:50:51 UTC	[gfx] Shader-object driven shader compilation. (#1688)	04 February 2021, 21:50:51 UTC
7f266f1	jsmall-nvidia	04 February 2021, 19:23:32 UTC	DownstreamDiagnostic::Type -> Severity (#1687) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP diagnostics for line number output. * Small param naming change * Use x macro for pass through compile human name lookup/getting. * WIP on parsing downstream compiler output. * Split out parsing into ParseDiagnosticUtil. Added test result of single line. * Dump out the std output on fail to parse diagnostics. * Change test type for syntax-error-intrinsic.slang be TEST not TEST_DIAGNOSTIC * Use Index for StringUtil. * WIP: First pass support for parsing Slang diagnostics. * WIP Testing comparing with ParseDiagnosticUtil with previous ad-hoc mechanism. * Use the new parsing mechanism for diagnostic comparisons. * Improvements to diagnostics parsing. Better error handling, and fallback handling. Added ability to parse downstream compilers without a prefix. Added ability to parse Slang with a prefix. * DownstreamDiagnostic::Type -> Severity and related fixes. * Small fixes around moving from DownstreamDiagnostic::Type -> Severity * Small comment fixes. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	04 February 2021, 19:23:32 UTC
ef283b8	Tim Foley	04 February 2021, 19:15:46 UTC	Change how function-scope static variables lower to IR (#1686) This change pertains to `static` variables in function scope (including things like methods, initializers, property accessors, etc.). Note that it does not have anything to do with global-scope `static` variables or with `static const` variables (whether inside a function or not). The old code generation strategy had a lot of "clever" code to deal with the problem of a `static` variable inside a generic function (or inside a function inside a generic type, etc.). Basically, if you had input code like: int myFunc<T>(int newVal) { static int state = 0; int result = state; state = newVal; return result; } The language semantics are that `myFunc<float3>` should have a different `state` variable than `myFunc<int2>`. The way that the existing codegen handled that was to generate the `state` variable into its own dedicated `IRGeneric`. Something like: generic myFunc_state<T0> { global_var g_ptr : int; return g_ptr; } generic myFunc<T1> { func f(int newVal) { let result : int = load(state<T>); store(state<T1>, newVal); return result; } } The catch there is that you end up needing to generate an entire second `IRGeneric`, and then references to `state` need to explicitly use `specialize` to instantiate that generic using the same parameters as `myFunc` was passed (note how `T0` and `T1` are distinct IR generic parameters, despite both representing `T` here). Things get even more complicated when you consider function-`static` variables with initialization logic, since we need to be sure we only perform that initialization once, but the initialization could refer to arguments of the outer function, and thus needs to be done inside the function body. To handle that case we emit an additional `bool` global if a function-`static` variable has an initializer, and that `bool` gets wrapped up in yet another generic. 
That whole approach seems silly in retrospect, and a much simpler solution is possible: just emit the function-`static` variable immediately before the IR function it pertains to, which means it will be nested under the same IR generic if there is one (and at module scope if there isn't). The result is something like: generic myFunc<T1> { global_var state_ptr : int*; func f(int newVal) { let result : int = load(state_ptr); store(state_ptr, newVal); return result; } } This change implements that simplification, and all the same tests pass (including whatever tests we had for function-`static` variables).	04 February 2021, 19:15:46 UTC
4c66c17	jsmall-nvidia	03 February 2021, 21:31:58 UTC	Diagnostic comparison using parsing (#1683) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP diagnostics for line number output. * Small param naming change * Use x macro for pass through compile human name lookup/getting. * WIP on parsing downstream compiler output. * Split out parsing into ParseDiagnosticUtil. Added test result of single line. * Dump out the std output on fail to parse diagnostics. * Change test type for syntax-error-intrinsic.slang be TEST not TEST_DIAGNOSTIC * Use Index for StringUtil. * WIP: First pass support for parsing Slang diagnostics. * WIP Testing comparing with ParseDiagnosticUtil with previous ad-hoc mechanism. * Use the new parsing mechanism for diagnostic comparisons. * Improvements to diagnostics parsing. Better error handling, and fallback handling. Added ability to parse downstream compilers without a prefix. Added ability to parse Slang with a prefix.	03 February 2021, 21:31:58 UTC
a1d543d	Tim Foley	02 February 2021, 23:45:19 UTC	Remove GlobalGenericParamSubstitution (#1684) The `GlobalGenericParamSubstitution` class used to be used to represent the mapping of global-scope generic parameters to their concrete arguments, so that we could make use of those concrete arguments for things like layout. That representation caused a lot of pain for other parts of the compiler, though, because everything that dealt with `Substitution`s needed to account for the possibility of global-generic-param substitutions even if they logically could not occur in most parts of the compiler. We have since moved to a model where the values for global-scope generic parameters are stored in a single explicit global structure that is used by both layout computation and IR lowering. There is no actual code that constructs `GlobalGenericParamSubstitution`s from scratch any more, so all of the support code for them was actually unused. This change removes all the unused code, and shows that the tests still pass without it (even the tests that use global-scope generic parameters).	02 February 2021, 23:45:19 UTC
17d2b24	jsmall-nvidia	02 February 2021, 22:45:56 UTC	Downstream compiler line number test (#1682) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP diagnostics for line number output. * Small param naming change * Use x macro for pass through compile human name lookup/getting. * WIP on parsing downstream compiler output. * Split out parsing into ParseDiagnosticUtil. Added test result of single line. * Dump out the std output on fail to parse diagnostics. * Change test type for syntax-error-intrinsic.slang be TEST not TEST_DIAGNOSTIC	02 February 2021, 22:45:56 UTC
5d755e5	jsmall-nvidia	29 January 2021, 22:12:38 UTC	Small improvements to CUDA doc (#1681) * #include an absolute path didn't work - because paths were taken to always be relative. * Small typo fixes for docs on CUDA target.	29 January 2021, 22:12:38 UTC
0ab4d04	Tim Foley	29 January 2021, 21:17:51 UTC	Fix issue when passing ray query to a subroutine (#1680) The problem would manifest for any code that declared a DXR 1.1 `RayQuery` value, but then only used it at one location in their code. The most common way for this to arise in user code was declaring a `RayQuery` and then handing it off to a helper/worker subroutine. RayQuery<0> myRayQuery; helperRoutine(myRayQuery, ...); The root cause was in the emit logic, where the initialization of `myRayQuery` above (a `defaultConstruct` operation in our IR) was getting folded into its (only) use site. This folding makes some sense, because the initialization of a ray query is not an operation with side effects, but doesn't work in practice because our way of handling default construction in HLSL output is by using a variable declaration. The simple fix here is to ensure that `defaultConstruct` instructions never get folded into use sites. If we decide to revisit the logic here, it might be possible to separate out the case where a `defaultConstruct` is being used as a stand-alone instruction, where we can emit it as: RayQuery<0> myRayQuery; versus cases where the `defaultConstruct` is being used as a sub-expression, such as: helperRoutine(RayQuery<0>(), ...); Whether or not we can emit the latter form (or if it would be equivalent) depends on details of how constructors like this are being implemented in dxc. For now it seems safest to emit things in a form that is obviously expected to work. Aside: Historically, the HLSL language has had no notion of "constructors" as being a thing. A variable that is declared but not initialized in HLSL has always been left uninitialized, since the first version of the language. The `RayQuery` type in DXR 1.1 is the first example of a type that appears to have a C++-style "default constructor," although HLSL as implemented by dxc still does not expose constructors as a user-visible or documented feature. 
(There is the small detail that the DXR 1.0 `HitGroup` type also relied on C++ constructor syntax, but I'm not aware of anybody using that feature right now, so it is mostly a curiosity.)	29 January 2021, 21:17:51 UTC
da6463a	jsmall-nvidia	28 January 2021, 21:05:49 UTC	README.md update (#1679) * #include an absolute path didn't work - because paths were taken to always be relative. * Added trying out section to README.md Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	28 January 2021, 21:05:49 UTC
615dfba	Yong He	27 January 2021, 18:02:44 UTC	Make own a slang session. (#1678)	27 January 2021, 18:02:44 UTC
a90c850	Tim Foley	26 January 2021, 21:32:03 UTC	Integrate reflection more deeply into gfx layer (#1677)	26 January 2021, 21:32:03 UTC
50676c7	jsmall-nvidia	26 January 2021, 21:04:44 UTC	Obfuscation naming issue fix (#1676) * #include an absolute path didn't work - because paths were taken to always be relative. * Work around for issue with obfuscation (and lack of name hints) leading to names in output not being correctly uniquified. * Improve appendChar Remove unrequired memory juggling to scrub names. * Remove test code. * Small fixes in comments and method called. * Remove linkage decoration on functions that are specialized. * Obfuscation naming with specialization test. * Fix instruction deletion. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	26 January 2021, 21:04:44 UTC
798d773	jsmall-nvidia	26 January 2021, 17:15:08 UTC	Improved NVRTC location finding (#1674) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP more sophisticated mechanism to find NVRTC. * Improve nvrtc searching to include PATH. * Make getting an extension able to differentiate between no extension, and just a . * Add comment. * Add support for searching instance path. * Small improvements around scope and finding NVRTC. * Improve documentation around NVRTC loading.	26 January 2021, 17:15:08 UTC
00fad59	jsmall-nvidia	22 January 2021, 21:18:04 UTC	Add nvrtc shared library/dll names (#1673) * #include an absolute path didn't work - because paths were taken to always be relative. * Add other NVRTC versions. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	22 January 2021, 21:18:04 UTC
6601220	Yong He	22 January 2021, 21:17:28 UTC	Further flatten IR natvis views (#1672) * Further flatten IR natvis views * improvements * formatting Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	22 January 2021, 21:17:28 UTC
76db336	Yong He	22 January 2021, 17:22:45 UTC	Fix existential specialization of mutable buffer loads. (#1671) * Fix existential specialization of mutable buffer loads. * fix Co-authored-by: Yong He <yhe@nvidia.com>	22 January 2021, 17:22:45 UTC
dc063e5	Yong He	22 January 2021, 02:05:53 UTC	Make natvis to discover and display IRInst names more directly (#1670) Co-authored-by: Yong He <yhe@nvidia.com>	22 January 2021, 02:05:53 UTC
3fc90d4	Yong He	21 January 2021, 22:44:01 UTC	Initialize unused fields in packAnyValue (#1669)	21 January 2021, 22:44:01 UTC
b52fcf9	Yong He	21 January 2021, 22:21:55 UTC	Fix D3D12 DescriptorSet::setSampler bug (#1668)	21 January 2021, 22:21:55 UTC
b762c75	Yong He	21 January 2021, 21:55:38 UTC	Fix reflection to correctly report descriptor ranges of `StructuredBuffer`s of existential types. (#1667)	21 January 2021, 21:55:38 UTC
6c8135f	Yong He	21 January 2021, 20:21:00 UTC	Add `StructuredBuffer` support in `gfx`. (#1666)	21 January 2021, 20:21:00 UTC
4b97833	Yong He	21 January 2021, 18:25:00 UTC	Fix type legalization bug involving nested empty struct. (#1665)	21 January 2021, 18:25:00 UTC
3d21e7a	Tim Foley	21 January 2021, 16:38:06 UTC	Upgrade slang-binaries for glslang 11.1.0 (#1664) This change also switches the build back to using prebuilt glslang binaries instead of always building from source.	21 January 2021, 16:38:06 UTC
660cf7a	Tim Foley	20 January 2021, 17:23:39 UTC	Update glslang to 11.1.0 (#1662) * Update glslang to 11.1.0 This change pulls new versions of glslang, spirv-headers, and spirv-tools as submodules, and makes the necessary changes to other files in the repository to get it all building (at least on Windows). This change also enables building of glslang from source by default, so that we can easily generate new binaries for inclusion in the `slang-binaries` repository. * fixup: missing file	20 January 2021, 17:23:39 UTC
c6fd4a5	Yong He	19 January 2021, 17:10:15 UTC	Make `ShaderCursor` no longer depend on core. (#1661)	19 January 2021, 17:10:15 UTC
1296c7b	Yong He	18 January 2021, 06:00:49 UTC	Make `gfx` compile to a DLL. (#1660) * Make `gfx` compile to a DLL. * Fix cuda * Fix cuda build * Bug gl screen capture bug.	18 January 2021, 06:00:49 UTC
2a5d5b3	Tim Foley	15 January 2021, 20:10:06 UTC	Convert more tests to use shader objects (#1659) This change converts a large number of our existing tests to use the `ShaderObject` support that was added to the `gfx` layer. In many cases, tests were just updated to pass `-shaderobj` and the result Just Worked. In other cases, a `name` attribute had to be added to one or more `TEST_INPUT` lines. For tests that did not work with shader objects "out of the box," I spent a little bit of time trying to get them to work, but fell back to letting those tests run in the older mode. Future changes to the infrastructure will be needed to get those additional tests working in the new path. Along with the changes to test files, the following implementation changes were made to get additional tests working: * Because the shader object mode uses explicit register bindings (from reflection), the hacky logic that was offsetting `u` registers for D3D12 based on the number of render targets gets disabled (by another hack). * The "flat" reflection information coming from Slang was not correctly reporting "binding ranges" for things that consumed only uniform data (which would be everything on CUDA/CPU), so it was refactored to properly include binding ranges for anything where the type of the field/variable implied a binding range should be created (even if the `LayoutResourceKind` was `::Uniform`). * A few fixes were made to the CUDA implementation of `Renderer`, in order to get additional tests up and running. Most of these changes had to do with texture bindings, which hadn't really been tested previously. In addition, a few changes were made that were attempts at getting more tests working, but didn't actually help. These could be dropped if requested: * As a quality-of-life feature (not being used) the `object` style of `TEST_INPUT` line is upgraded to support inferring the type to use from the type of the input being set. 
* Any `object` shader input lines get ignored in non-shader-object mode.	15 January 2021, 20:10:06 UTC
f834f25	Yong He	14 January 2021, 23:48:54 UTC	COM-ify all slang-gfx interfaces. (#1656) * COM-ify all slang-gfx interfaces.	14 January 2021, 23:48:54 UTC
ac76997	jsmall-nvidia	14 January 2021, 23:03:51 UTC	Adding missing VisualStudio lz4 project (#1657) * #include an absolute path didn't work - because paths were taken to always be relative. * Added missing lz4 visual studio project.	14 January 2021, 23:03:51 UTC
723796a	jsmall-nvidia	11 January 2021, 20:24:11 UTC	LZ4 compression support (#1654) * #include an absolute path didn't work - because paths were taken to always be relative. * Testing out use of lz4. * Added ICompressionSystem, and LZ4 implementation. * Add support for deflate compression. Simplify compression interface - to make more easily work across apis. * WIP on CompressedFileSystem. * ImplicitDirectoryCollector * SubStringIndexMap - > StringSliceIndexMap. * WIP save stdlib in different containers. * Support for different archive types for stdlib. * Fix project. * CompressedFileSystem -> ArchiveFileSystem. Added CompressionSystemType::None * Added ArchiveFileSystem * Fix problem RiffFileSystem load without compression system. * Test archive types. Improve diagnostic message. * Fix typo in testing file system archives. * Split out archive detection. * Fix gcc warning issue. * Fix warning. * RiffArchiveFileSystem -> RiffFileSystem Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	11 January 2021, 20:24:11 UTC
5554777	Yong He	11 January 2021, 17:11:52 UTC	Make `gfx::Renderer` a COM interface. (#1653) * Make `gfx::Renderer` a COM interface. This is a first step towards making the `gfx` library expose a COM compatible DLL interface. Remaining classes will come as separate PRs. * Fixup project files * Fix calling conventions * Make gfx::createRenderer() functions increase ref count by 1 Make renderer createFunc return via out parameter	11 January 2021, 17:11:52 UTC
e24c5a6	Tim Foley	08 January 2021, 00:01:48 UTC	Fill in some missing bits of capability API (#1652) * Fill in some missing bits of capability API * Make invalid/unknown capability have a zero value (this aligns it better with the public API for `SlangProfileID`, so that the two can be merged down the line) * Actually provide an implementation of `spFindCapability()` public API * fixup: bug fixes for renumbering invalid capability atom	08 January 2021, 00:01:48 UTC
d84f458	Tim Foley	07 January 2021, 21:16:18 UTC	Add a -capability command-line option (#1651) This provides a stand-alone option distinct from `-profile` that can be used to add capabilities to a target. A test has been added to confirm that `-profile X -capability Y` works the same as `-profile X+Y`. The intention is that this option could be used in applications that use the API to set up their target but then use the options-parsing logic to handle things like capabilities. Note: that latter bit has not been confirmed, so it is possible that this approach does not actually suffice for hybrid API + options usage. That will need to be confirmed in follow-up work.	07 January 2021, 21:16:18 UTC
66d4466	Tim Foley	07 January 2021, 20:18:55 UTC	Add support for [noinline] attribute (#1650) This adds the `[noinline]` attribute to the front-end, and passes it through when generating HLSL output. Notes: * This change doesn't include a test since the dxc version I have locally parses `[noinline]` but then generates DXIL that fails validation. * This change doesn't include logic to handle `[noinline]` for other targets. Notably, SPIR-V has decorations that convey the same intention, but we don't yet take advantage of the GLSL extension(s) that would let us generate those decorations. * By necessity, `[noinline]` is only a "strong suggestion" and not actually something the compiler can ever guarantee/enforce.	07 January 2021, 20:18:55 UTC
9263651	Yong He	06 January 2021, 20:58:57 UTC	Refactor GUI/Window utils out of gfx library (#1649) Co-authored-by: Yong He <yhe@nvidia.com>	06 January 2021, 20:58:57 UTC
706d4f9	Tim Foley	05 January 2021, 21:01:17 UTC	Add basic GLSL support for SV_Barycentrics (#1648) * Add basic GLSL support for SV_Barycentrics This change allows for fragment shader varying inputs marked with the `SV_Barycentrics` semantic to be mapped to GLSL code using the `gl_BaryCoordNV` builtin variable (from the `GL_NV_fragment_shader_barycentric` extension). This is the simplest possible change to get the functionality up and running, and it leaves out many things that could be desired in a more feature-complete version of the feature later: * There is no support for alternative extensions that provide similar functionality. Selection of which extension to favor could eventually be based on the "capability" work that has been put in place. * There is no attempt made to check that the input has the expected type (or to coerce it if it doesn't), so for now this is only going to be guaranteed to work for a `float3` input. * This change does not expose the `pervertexNV` qualifier added in the `GL_NV_fragment_shader_barycentric` extension, which can be used by a shader to access the uninterpolated vertex inputs. The last issue is an important one, since the HLSL `GetAttributeAtVertex` function seems to be defined to work with any incoming varying parameter that was marked with `nointerpolation`. When we have a `nointerpolation` input, it would seem that we need to know whether it will be used with `GetAttributeAtVertex` (in which case it should be declared as a `pervertexNV` array input in GLSL) or not (in which case it should be declared as a `nointerpolation` input, without an array). * fixup: missing file	05 January 2021, 21:01:17 UTC
b4f9462	Tim Foley	05 January 2021, 17:00:00 UTC	Use "capability" system to select VKRT extension (#1647) * Use "capability" system to select VKRT extension Slang currently supports translation of ray tracing shader code to Vulkan GLSL code that uses the `GL_NV_ray_tracing` extension. A multi-vendor equivalent of that extension has been released as `GL_EXT_ray_tracing` and we want Slang to support that extension as well. At the simplest, making the change from one extension to the other is just a matter of changing a few strings, since it does not appear that anything of significance was changed at the GLSL level (or even in SPIR-V). Where this gets trickier is when we have users who want us to support both extensions, and to be able to switch between them. The solution we've implemented here more or less amounts to: * If you don't tell the compiler which extension to use, it will default to `GL_EXT_ray_tracing` (the newer multi-vendor one). * If you explicitly want the older extension, you can opt into it using the `-profile` option or via a new API for explicitly adding capabilities to your target. Making that work required a few different kinds of changes: * The options parsing and public API needed ways to add optional capabilities to a target. * During GLSL code emit, we can check the capabilities that were added to the target to see if the `GL_NV_ray_tracing` extension was explicitly enabled and, if not, default to using the `GL_EXT_ray_tracing` names for things. This step is needed because some of the modifiers/attributes involved in the extension have to be handled explicitly in the code generator rather than implicitly as part of mapping intrinsic functions. * We add two different translations to the relevant operations in the stdlib, one marked with each of the extensions. If profile/capability-based overload resolution can be relied on to pick the right one, this should Just Work. 
* Next, a bunch of work had to go into making capability-based overloading Just Work for the purposes of this change. There's been a nearly complete reworking of the implementation of `CapabilitySet` here to make it more suitable for our needs. * The tests that were using ray tracing translation for Vulkan needed to be updated. For some of them I updated their baselines to use `GL_EXT_ray_tracing` so that they can test the new path. For others, I updated the command line for the test case so that it explicitly opts into using `GL_NV_ray_tracing`. The result is that we have some coverage of each extension. I would have liked to have each test run in both modes, but our pass-through glslang support doesn't support `-D` options, so I couldn't take that step easily. This change does not add support for `GL_EXT_ray_query`, the extension that supports "DXR 1.1" style queries under Vulkan. Adding support for that extension should hopefully be a smaller step because it doesn't have the same multiple-extensions issue. This change does not address a lot of possible avenues for improvement or cleanup around the capability system. It focuses only on those changes that are necessary to make the ray tracing feature work and leaves the rest for future work. * fixup: infinite loop * Comment-only change to retrigger TC build	05 January 2021, 17:00:00 UTC
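The default-selection rule described in the commit above (prefer `GL_EXT_ray_tracing` unless `GL_NV_ray_tracing` was explicitly added to the target) can be sketched roughly as follows. This is only an illustration under stated assumptions: `selectRayTracingExtension` and the use of a plain string set for target capabilities are hypothetical and are not taken from the Slang code base.

```cpp
#include <set>
#include <string>

// Hypothetical helper (not Slang's actual API): given the capabilities that
// were explicitly added to a target, return the GLSL ray tracing extension
// name that code emission should use.
std::string selectRayTracingExtension(const std::set<std::string>& targetCapabilities)
{
    // The older NV extension is used only when it was explicitly opted into.
    if (targetCapabilities.count("GL_NV_ray_tracing") != 0)
        return "GL_NV_ray_tracing";

    // Otherwise default to the newer multi-vendor extension.
    return "GL_EXT_ray_tracing";
}
```

With an empty capability set the sketch picks the EXT extension; adding `"GL_NV_ray_tracing"` to the set switches it to the NV one.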
f5fffa9	Dietrich Geisler	18 December 2020, 17:10:10 UTC	Heterogeneous Flag Error Visibility (#1642) * PR to fix issue #1638. This change introduces a diagnostic sink to the emitModule function, and updates all associated calls to that function. Additionally, this commit updates the heterogeneous hello world example to not need the entry and stage flags for simplicity. * Updated emit-cpp per suggested changes Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	18 December 2020, 17:10:10 UTC
0fa3bcf	Yong He	15 December 2020, 20:57:55 UTC	Cleanup CUDA renderer. (#1644) * Cleanup CUDA renderer. * More cleanup * fixes. * update comments Co-authored-by: Yong He <yhe@nvidia.com>	15 December 2020, 20:57:55 UTC
77bc70e	jsmall-nvidia	15 December 2020, 19:04:10 UTC	OSX Build/glslang premake fix (#1641) * #include an absolute path didn't work - because paths were taken to always be relative. * Improve docs. Fix premake build of glslang. * More improvements to the building.md doc.	15 December 2020, 19:04:10 UTC
1bc7948	jsmall-nvidia	14 December 2020, 21:18:51 UTC	Enable embedding stdlib for github builds (#1640) * #include an absolute path didn't work - because paths were taken to always be relative. * Enable building with embedding stdlib.	14 December 2020, 21:18:51 UTC
856d7d3	Yong He	11 December 2020, 17:42:23 UTC	Implements CUDA renderer in gfx. (#1637) * Implements CUDA renderer in gfx. * Revert unnecessary change. * Revert unnecessary changes. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	11 December 2020, 17:42:23 UTC
992778e	Tim Foley	11 December 2020, 16:50:43 UTC	Add first steps toward a "capability" system (#1636) * Add first steps toward a "capability" system We already have cases in the stdlib where we mark declarations as being specific to certain targets, e.g.: ``` // My ordinary function to add two numbers. // Works everywhere. // void myFunc(int a, int b) { return a + b; } // On the "coolgpu" target, we can use a secret intrinsic // that adds numbers even faster! // __specialized_for_target(coolgpu) void myFunc(int a, int b) { return __secretIntrinsic(a, b); } ``` The existing logic for dealing with these modifiers (`__specialized_for_target` and `__target_intrinsic`) was almost entirely string-based. We would turn the chosen compilation target into a string, and then use that to try and search for the "best" definition of a function at a few steps: * During IR linking, we always pick one definition of an `[import]`ed function, and that definition will be the one with the "best" target-specialization modifier (if any) * During final code generation, we always look up the "best" target-intrinsic modifier, and use it as the template for the code we output. This change preserves the basic flow there, but replaces the ad hoc string-based logic with something a bit more principled, in terms of a new `CapabilitySet` type. A `CapabilitySet` represents a set of zero or more atomic features (here represented as `CapabilityAtom`s). What a `CapabilitySet` means depends on how and where it is used: * A compilation target implies a `CapabilitySet` where the contents of the set are the features the target supports. * A `CapabilitySet` attached to a declaration (or a modifier on that declaration) describes a set of features that declaration requires. The current implementation of `CapabilitySet` is wasteful and inefficient, but that is something we can iterate on over time. 
In practice, most of the current code only ever uses capability sets that are either empty (because they represent a function with no specific requirements) or singleton (because they represent a single atomic capability like "is a GLSL target," "is an HLSL target," etc.). The main goal here was to put in the skeleton of a new system, including some of the features it might need down the line, and then to leave changes that eventually use the greater flexibility for later. Eventually, the capability system should encompass: * Differences between shader model versions, GLSL versions, SPIR-V versions, etc. (currently tracked with other modifiers) * Optional extensions, and functions that are made available only with certain extensions (currently tracked with other modifiers) * Front-end checking that the call graph of a program doesn't violate any capability-requirements (e.g., having a GLSL+HLSL portable function call a GLSL-only subroutine) * Hypothetically we can also try to fold stage-specific (vertex-only, fragment-only, etc.) functions into this system, but doing so would require more linker cleverness if we allow overloading on stages (since we might have to clone a caller if it calls through to a callee with multiple stage-specific versions) One important complication that the system has to deal with just because of the "do what I mean" nature of the current compiler is that sometimes a current Slang user might compile for target X and specify version N, but then use a function that actually requires version N+1 of that target. Currently the Slang compiler silently "upgrades" the version(s) used by user code in these cases, because it is often what users want in cross-compilation scenarios. Dealing with the "silent upgrade" situation requires us to be a little careful and sometimes pick a "best" capability set that doesn't appear to be supported on our target. 
Refining that system and potentially getting rid of the "do what I mean" behavior over time could be a goal for future changes. * fixup: handle case where value is incompatible during linking	11 December 2020, 16:50:43 UTC
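The idea in the commit above, where a target implies a set of supported atoms, each declaration carries a set of required atoms, and the "best" definition is chosen among the compatible ones, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the string-based `CapabilityAtom`, the `Candidate` struct, `pickBest`, and the "most required atoms wins" tie-break are hypothetical and do not reproduce Slang's actual `CapabilitySet` implementation (in particular, the sketch ignores the "silent upgrade" case the message mentions).

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in: an atomic feature such as "glsl" or "coolgpu".
using CapabilityAtom = std::string;

// A set of zero or more atomic features.
struct CapabilitySet
{
    std::set<CapabilityAtom> atoms;

    // True if this set (e.g. what a target supports) contains every atom
    // in `required` (e.g. what a declaration needs).
    bool implies(const CapabilitySet& required) const
    {
        for (const auto& atom : required.atoms)
            if (atoms.count(atom) == 0)
                return false;
        return true;
    }
};

// One candidate definition of a function, with its requirements.
struct Candidate
{
    std::string   name;
    CapabilitySet requirements;
};

// Return the index of the "best" candidate for `target`, or -1 if none is
// usable. Assumption: among usable candidates, the one requiring the most
// atoms is treated as the most specialized.
int pickBest(const CapabilitySet& target, const std::vector<Candidate>& candidates)
{
    int best = -1;
    for (int i = 0; i < static_cast<int>(candidates.size()); ++i)
    {
        if (!target.implies(candidates[i].requirements))
            continue; // target does not support what this candidate needs
        if (best < 0 ||
            candidates[i].requirements.atoms.size() >
                candidates[best].requirements.atoms.size())
        {
            best = i;
        }
    }
    return best;
}
```

In this sketch a candidate with an empty requirement set plays the role of the "works everywhere" definition, and a candidate requiring `"coolgpu"` is preferred only when the target's set contains that atom.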
4337338	jsmall-nvidia	10 December 2020, 19:04:29 UTC	Building with embedded stdlib (#1634) * #include an absolute path didn't work - because paths were taken to always be relative. * Move reflection to reflection-api. * Slight reorg to pull out potentially Slang internal functions from the reflection API impls. * Remove visual studio projects * Fix for slang-binaries copy. * Add the visual studio projects in build/visual-studio * Remove miniz project. * Differentiate the linePath from the filePath. * Improve comment in premake5.lua + to kick off CI. * Kick CI. * Use COM compile request for calls to functions inside api-less-slang. Add static-slang project. * Fix const typo issue. * Don't include 'core' link in 'api-less-slang' * Removed static-slang lib causes problems on linux with linking. Embed Slang stdlib Added StaticBlob Added dumpSourceBytes Use ConstArrayView for the archive. At startup allow loading of zip with stdlib. Made -save-stdlib -load-stdlib take a name Added '-save-stdlib-bin-source' to save out serialized stdlib as source. * Ability to enable/disable stdlib embedding. * Fix problem with moduleDecl not having module pointer set when serialized in. * Set of debugdir for slang-test and examples. * Add slang-stdlib-api.cpp * Update slang filters for VS. * Try to use pic, and -mcmodel=medium * Some more efforts to make premake work. * WIP premake5.lua from previously working version. * Remove api-less-slang project. * Disable dllexport on gcc/clang. * Embed via slangc-bootstrap. * Fix slang-profile. Always compiles without stdlib. * Use pic "On" * Remove slangc-bootstrap and embed-stdlib-generator if embedding not required. Make bootstrap run the generators. * Improve comments in premake5.lua. Kick off another CI build. * Remove generation of stdlib source from std-lib-serialize.slang	10 December 2020, 19:04:29 UTC
e4a8251	Yong He	10 December 2020, 17:43:09 UTC	Move ShaderObject to be under renderer interface. (#1633) * Move ShaderObject to be under renderer interface. * Make `createPipelineState` take `const PipelineStateDesc&`. Move ShaderCursor implementation to a cpp file	10 December 2020, 17:43:09 UTC
b8e1f62	Tim Foley	08 December 2020, 02:18:31 UTC	Fix a subtle bug introduced into type legalization (#1632) The refactor of type legalization in PR #1594 introduced a subtle problem where an IR instruction might be removed from the hierarchy (perhaps because its parent was removed during legalization) but would still be on the work list. Legalization of such instructions is wasteful (since it would never impact the output), but it also creates a problem if we try to insert new legalized instructions next to such a removed instruction. The logic for inserting an instruction before/after another asserts that the sibling instruction must have a parent, and leads to a failure in debug builds and a potential crash in release builds. This change adds a bit of defensive code to skip any instructions that appear to have been removed from the hierarchy (because they have no parent and are not the root/module instruction). An alternative approach would be to try to detect these instructions at the point where they would be added to the work list, but this approach seems simpler and more general.	08 December 2020, 02:18:31 UTC
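The defensive check described in the commit above, skipping work-list entries that have no parent and are not the root/module instruction, might look roughly like the sketch below. The `Inst` struct, its fields, and `processWorkList` are hypothetical simplifications for illustration and are not Slang's IR types or its legalization pass.

```cpp
#include <vector>

// Hypothetical, heavily simplified IR instruction.
struct Inst
{
    Inst* parent    = nullptr; // null for the root, or for a removed instruction
    bool  isModule  = false;   // true only for the root/module instruction
    bool  legalized = false;   // set when the pass has handled this instruction
};

// Process every instruction on the work list, skipping any that appear to
// have been removed from the hierarchy. Returns how many were processed.
int processWorkList(const std::vector<Inst*>& workList)
{
    int processed = 0;
    for (Inst* inst : workList)
    {
        // No parent and not the module root: treat as removed and skip it,
        // so nothing is ever inserted next to an orphaned instruction.
        if (inst->parent == nullptr && !inst->isModule)
            continue;

        inst->legalized = true; // placeholder for the real legalization work
        ++processed;
    }
    return processed;
}
```

The sketch filters at processing time, which matches the approach the message says was chosen over filtering when instructions are added to the work list.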
0404fef	Tim Foley	07 December 2020, 22:17:17 UTC	Fix mistake where some public API functions weren't extern "C" (#1631) This was a serious problem that meant that some of our public API functions (exported from the DLL) had mangled C++ symbol names (which are not guaranteed to be portable across different compilers). The fix here is the expedient one: just add `extern "C"` to the relevant functions. The simplicity has a cost, though, in that this change introduces a significant break in binary compatibility. Source code should be compatible before/after this change, but it would be necessary to match an application and Slang shared library so that they agree on the mangled names of these symbols. We will need to treat this as a breaking release when this change goes out. Note that because of the way that C++ mangled names were being used, we already have introduced breaking changes to the binary interface in a recent change that made `SlangCompileRequest` into a `typedef` instead of a forward-declared `struct` type. To be clear, the breakage was due to the missing `extern "C"` and not the content of that change; the change would have been binary-compatible if the symbol names were correct. Given that the cat is out of the bag to some extent (our next release is going to break compatibility one way or another), it seems best to just get the whole thing sorted at once.	07 December 2020, 22:17:17 UTC
d7ce74a	Tim Foley	07 December 2020, 17:29:37 UTC	"Shader Toy" example and related fixes (#1629) * "Shader Toy" example and related fixes This change introduces a new `shader-toy` example program that is primarily designed to show how Slang's features for type-based encapsulation and modularity can be applied to modularity for effects along the lines of those from `shadertoy.com`. The Example ----------- The example is being checked in with an example "toy" effect that I hastily put together, so that it would not be encumbered with any IP concerns. I wrote the effect using the shadertoy.com editor, so I can be sure it is valid GLSL. During bringup of the application I used a pre-existing and larger effect for testing, so some of the support code that was added is not being used at present. The big-picture idea here is to have an example that shows how to modularize things using Slang interfaces and generics, and then to use the Slang compiler API to manage the compilation, composition, specialization, and linking steps. For better or worse this leads to the sequence of API calls involved being much longer than what was in something like the `hello-world` example. Future Work (Example) --------------------- There is a lot of room for improvement and expansion here, so this should be viewed as a checkpoint of work in progress rather than something I'm claiming as a finalized demonstration of all we'd like to achieve. Areas for future work include: * We need to copy the integration of "Dear ImGui" that was already done for the `model-viewer` example so that this example can have a UI. * Now that the compilation flow is broken into all these additional steps, it should be possible to have the application load multiple effects as distinct modules, and then provide a UI for switching between them. The chosen effect module would be used to specialize the top-level shader(s) before kernel generation. 
* The checked-in logic includes a compute shader that can execute an effect, but that hasn't been tested nor has it been wired up to any kind of UI. We should have a way to switch between multiple execution methods, with a goal of eventually including CPU execution. * The "GLSL compatibility" code needs a lot of improvements before it is likely to be usable for a nontrivial number of shaders. Some of that work is waiting on Slang compiler fixes, though. * We should consider allowing the individual "toy" effects to define their own uniform parameters and expose those via a UI and reflection. The catch in this case is not that this would be difficult to do, but that it would be a semantic change to how shader toy effects currently work. The Compiler Fixes ------------------ Doing this work exposed a few bugs in Slang, and this change includes fixes for the ones that were quick to address. We already had logic in `slang-check-shader.cpp` that was validating the entry points in a compile request - either by checking the explicitly-listed entry points, or by scanning for `[shader("...")]` attributes. The problem is that the routine that did that checking was not being invoked on all compiles. The logic that handled entry points was only being run for manual compiles using `SlangCompileRequest`, while anything using `import` or `loadModule` would ignore entry points. I refactored the relevant code into a subroutine that will be invoked in all compilation scenarios. There were already `TODO` comments in `SpecializedComponentType` which made the point about how a specialized entry point like `myShader<YourType>` would need to properly show that it has dependencies on both the module that defines `myShader` and the module that defines `YourType`, while only the former was being handled at present. 
I went ahead and implemented the logic to scan the generic arguments for a specialized component type in order to determine what module(s) the arguments depend on (both type arguments and witness tables). With that change, using `IComponentType::link` on a specialized component will properly pull in the module(s) that the generic arguments come from. In `slang-ir-legalize-types.cpp` we could run into assertion failures in debug builds because of code trying to legalize layout `IRAttr`s for fields or parameters with types that need legalization. In practice it is safe to skip these layout attributes, because legalization of the fields/parameters they pertain to would result in creation of entirely new layout attributes, and the old ones would then be unreferenced. Future Work (Fixes) ------------------- There are other compiler bugs that this work exposed, but which this change does not address. These will need to be resolved as part of subsequent changes: * Slang allows for default-initialization of variables of a generic type. That is, given `<T : ISomething>` a user is allowed to declare `T x = {};` and the Slang front-end does not complain. Instead, this leads to an internal compiler error during IR lowering. * The Slang `__init()` feature probably needs to be upgraded to a properly supported feature, and we probably need a way to make implementing default-initialization an easy thing (e.g., any `struct` type that has initial-value expressions for all its fields should automatically and implicitly satisfy an `init();` requirement declared in an interface) * Inside an `__init()` definition, code has mutable access to members of the enclosing type, but for some reason the front-end is incorrectly treating `this` as immutable in those contexts. As a result you can write to `someField` but not `this.someField`. 
* User-defined operator overloads flat out don't work (which isn't surprising given that no clients have decided to use them yet, and we have no test coverage for them). This is actually due to the shadowing rules being used for lookup right now, so a fix for this issue is going to have far-reaching consequences around what overloads are visible where (and anything that impacts overload resolution is a big can of worms, including around performance). * fixup: test case had missing main function	07 December 2020, 17:29:37 UTC
e98c32f	Yong He	04 December 2020, 18:44:03 UTC	add windows release script (#1627) Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	04 December 2020, 18:44:03 UTC
47ed0f6	jsmall-nvidia	04 December 2020, 18:03:29 UTC	Projects in 'build' and Slang API separation (#1624) * #include an absolute path didn't work - because paths were taken to always be relative. * Move reflection to reflection-api. * Slight reorg to pull out potentially Slang internal functions from the reflection API impls. * Remove visual studio projects * Fix for slang-binaries copy. * Add the visual studio projects in build/visual-studio * Remove miniz project. * Differentiate the linePath from the filePath. * Improve comment in premake5.lua + to kick off CI. * Kick CI.	04 December 2020, 18:03:29 UTC
277780a	Yong He	03 December 2020, 22:48:42 UTC	Add github action to verify vs project file consistency. (#1625) * Add github action to verify vs project file consistency. * fix solution files * fix project files	03 December 2020, 22:48:42 UTC
a827798	jsmall-nvidia	03 December 2020, 17:55:29 UTC	Added miniz Visual Studio Project (#1623)	03 December 2020, 17:55:29 UTC
44c0a56	Yong He	03 December 2020, 16:23:05 UTC	Add shader object parameter binding to renderer_test. (#1622) * Add shader object parameter binding to renderer_test. * remove multiple-definitions.hlsl * Fix cuda implementation. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	03 December 2020, 16:23:05 UTC
ad5dda9	Tim Foley	02 December 2020, 22:12:51 UTC	Fix [mutating] generic methods (#1618) Slang generates code that turns the implicit `this` parameter of a method into an explicit parameter. The logic that decides whether that parameter should be `inout` is a bit involved, and there was a bug where a generic method would lead to the use of an `in` modifier (the default) and override the `inout` modifier that was requested by the method itself. This change fixes the logic to treat generic declarations in the parent chain of a leaf method as having no bearing on whether an implicit `this` parameter should be `inout` or not. A test case is included that breaks with the old behavior, and demonstrates that a generic `[mutating]` method can now work correctly.	02 December 2020, 22:12:51 UTC
