Revision history - None - origin: https://github.com/shader-slang/slang

Revision	Author	Date	Message	Commit Date
95a61ab	Tim Foley	05 April 2021, 20:40:45 UTC	Fix a bug in the "operator cache" (#1784) In order to speed up compilation, the semantic checking step uses a cache for overload resolution in the case of an operator being applied to operands of basic scalar types and vectors/matrices thereof. The logic for constructing keys for that cache was defensive against the case of, e.g., `vector<int,N>` where the element count `N` of the vector was not a literal value but a generic parameter. For some reason it did not have equivalent safeguards for a case like `vector<T,2>` where the element type was not a basic type, and it would instead assume all vector/matrix types had basic types as their element type. This change fixes the logic to make it properly defensive against this case. Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>	05 April 2021, 20:40:45 UTC
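The two guards described in the commit above can be sketched as follows. This is a minimal illustration, not Slang's actual code: `BasicKind`, `GenericParam`, `VectorType`, and `makeCacheKey` are hypothetical names, and the key packing is an assumption chosen only to keep the example small.

```cpp
#include <cstdint>
#include <optional>
#include <variant>

// Hypothetical, simplified type model; these names are not Slang's.
enum class BasicKind : std::uint8_t { Int, UInt, Float, Bool };

struct GenericParam {};  // stands in for a generic `T` or `N`
using ElementType = std::variant<BasicKind, GenericParam>;
using ElementCount = std::variant<std::uint32_t, GenericParam>;

struct VectorType {
    ElementType element;
    ElementCount count;
};

// Pack a vector type into a small integer key, or return nullopt when
// the type is not cacheable and full overload resolution is required.
std::optional<std::uint32_t> makeCacheKey(const VectorType& type) {
    // Guard 1: the element count must be a literal (rejects vector<int,N>).
    const auto* count = std::get_if<std::uint32_t>(&type.count);
    if (!count) return std::nullopt;
    // Guard 2: the element type must be basic (rejects vector<T,2>).
    const auto* kind = std::get_if<BasicKind>(&type.element);
    if (!kind) return std::nullopt;
    // Keep the count within one byte so distinct types get distinct keys.
    if (*count > 0xFF) return std::nullopt;
    return (static_cast<std::uint32_t>(*kind) << 8) | *count;
}
```

Returning an empty optional sends the caller down the slow path, so a missing guard costs correctness while an extra guard only costs speed.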
086ecf4	Yong He	05 April 2021, 20:31:05 UTC	Transient root shader object. (#1782)	05 April 2021, 20:31:05 UTC
dd662f5	jsmall-nvidia	05 April 2021, 16:51:52 UTC	Added tests/current-bugs (#1781) * #include an absolute path didn't work - because paths were taken to always be relative. * Added a current-bugs folder in tests for active (ie with issue) bug tests demonstrating the problem. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	05 April 2021, 16:51:52 UTC
fa4eda2	Ali Emre Gülcü	04 April 2021, 09:34:48 UTC	Fixed a typo on introduction (#1783)	04 April 2021, 09:34:48 UTC
e14d9ff	jsmall-nvidia	02 April 2021, 17:49:44 UTC	Repro fixes (#1780) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix handling names in repro with : Handle if file info is not set - which means its contents were not loaded.	02 April 2021, 17:49:44 UTC
e1ad4a2	jsmall-nvidia	01 April 2021, 23:37:54 UTC	cygwin - disable VK/CUDA (#1777) * #include an absolute path didn't work - because paths were taken to always be relative. * Disable Vulkan and CUDA on cygwin. Co-authored-by: Yong He <yonghe@outlook.com>	01 April 2021, 23:37:54 UTC
0ec8e5b	Tim Foley	01 April 2021, 23:15:09 UTC	Refactor D3D12 renderer root signature creation (#1779) This change originated as an attempt to re-enable a test case, but it has ended up disabling more tests (for good reasons) than it re-enables. The main change here is a significant overhaul of the way that the D3D12 render path extracts information from the Slang reflection API to produce a root signature. There were also some supporting fixes in the reflection information to make sure it returns what the D3D12 back-end needed. The big picture here is that the D3D12 path now uses the descriptor ranges stored in the reflection data more or less directly. It still needs to use register/space offset information queried via the "old" reflection API, but it only does so at the top level now, for the program and entry points themselves. All other layout information is derived directly from what Slang provides. Smaller changes: * The "flat" reflection API was expanded to include `getBindingRangeDescriptorRangeCount()` which was clearly missing. * The "flat" reflection results for a constant buffer or parameter block that didn't contain any uniform data and was mapped to a plain constant buffer needed to be fixed up. That logic is still way too subtle to be trusted. * Several additional tests were disabled that relied on static specialization, global/entry-point generic type parameters, structured buffers of interfaces or other features we don't officially support with shader objects right now. All of the affected tests were somehow passing by sheer luck and because they often passed in specialization arguments via explicit `TEST_INPUT` lines. 
* The `inteface-shader-param` test is re-enabled now that we can properly describe its input with the new `set` mode on `TEST_INPUT` * `ShaderCursor::getElement()` can now be used on structure types (in addition to arrays) to support by-index access to fields * The `TEST_INPUT` system was expanded to support both by-name and by-index setting of structure fields for aggregates * The `TEST_INPUT` system was expanded to allow an `out` prefix to mark parts of an expression as outputs on `set` lines * The `TEST_INPUT` system was expanded so that anything that would be allowed on a `TEST_INPUT` line by itself (like `ubuffer(...)`) can now be used as a sub-expression on a `set` line Co-authored-by: Yong He <yonghe@outlook.com>	01 April 2021, 23:15:09 UTC
9475b11	jsmall-nvidia	01 April 2021, 22:59:24 UTC	Associating GUID (or UUID) with types (#1776) * #include an absolute path didn't work - because paths were taken to always be relative. * Add mechanism to embed guid inside of type.	01 April 2021, 22:59:24 UTC
2a32fae	jsmall-nvidia	01 April 2021, 22:03:40 UTC	Add compiler-core project files for VS (#1778) * #include an absolute path didn't work - because paths were taken to always be relative. * Added compiler-core VS projects...	01 April 2021, 22:03:40 UTC
fa31d21	jsmall-nvidia	01 April 2021, 17:39:11 UTC	Added compiler-core project (#1775) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out compiler-core initially with just slang-source-loc.cpp * More lexer, name, token to compiler-core. * Split Lexer and Core diagnostics. * Move slang-file-system to core. * Add slang-file-system to core. * More DownstreamCompiler into compiler-core * Fix typo. * Add compiler-core to bootstrap proj. * Small fixes to premake * For linux try with compiler-core * Remove compiler-core from examples. * Added NameConventionUtil to compiler-core * Add global function to CharUtil to hopefully avoid linking issue. * Hack to make linkage of CharUtil work on linux.	01 April 2021, 17:39:11 UTC
3f1632a	Yong He	31 March 2021, 18:35:17 UTC	`gfx` explicit transient resource management. (#1774)	31 March 2021, 18:35:17 UTC
5fde038	jsmall-nvidia	31 March 2021, 17:11:49 UTC	Support for __LINE__ and __FILE__ in preprocessor (#1772) * #include an absolute path didn't work - because paths were taken to always be relative. * First pass support for __LINE__ and __FILE__. * Test include handling with __FILE__ Fix diagnostic compare when input is empty. * Fix some issues in preprocessor handling of special macros like __LINE__ Add a more complex test. * Use CONCAT2 in tests, because preprocessor doesn't quite get parameter expansion correct. * Make __FILE__ and __LINE__ behave more like Clang/Gcc. * A test for preprocessor bug. * Fix __LINE__ and __FILE__ in macro expansion, should be initiating location. * Fix some comments. * Small tidy up around builtin macros. * Small improvements for macro type names. Escape found paths.	31 March 2021, 17:11:49 UTC
5fefb12	Yong He	30 March 2021, 21:08:38 UTC	Update 04-interfaces-generics.md	30 March 2021, 21:08:38 UTC
59fdf73	Yong He	30 March 2021, 21:02:07 UTC	Override NOTE font size (#1773)	30 March 2021, 21:02:07 UTC
a0ef865	Yong He	30 March 2021, 20:48:38 UTC	Rename README.md to index.md	30 March 2021, 20:48:38 UTC
c4d8551	Yong He	30 March 2021, 20:40:20 UTC	Move user-guide table of contents to _includes dir (#1771)	30 March 2021, 20:40:20 UTC
6c5b463	Yong He	30 March 2021, 20:31:32 UTC	Add layout front matter specifier for user-guide docs (#1770)	30 March 2021, 20:31:32 UTC
bac7f63	Yong He	30 March 2021, 20:28:26 UTC	Update 01-get-started.md	30 March 2021, 20:28:26 UTC
4fcfd04	Yong He	30 March 2021, 20:27:08 UTC	Create toc.html	30 March 2021, 20:27:08 UTC
12b1634	Yong He	30 March 2021, 20:26:18 UTC	Update 00-introduction.md	30 March 2021, 20:26:18 UTC
997ffa1	Yong He	30 March 2021, 20:25:47 UTC	Update and rename documentation.html to user-guide.html	30 March 2021, 20:25:47 UTC
91ba42a	Yong He	30 March 2021, 20:21:32 UTC	Update documentation.html	30 March 2021, 20:21:32 UTC
93288a5	Yong He	30 March 2021, 20:05:20 UTC	Update documentation.html	30 March 2021, 20:05:20 UTC
0d01285	Yong He	30 March 2021, 20:02:25 UTC	Update documentation.html	30 March 2021, 20:02:25 UTC
b15f281	Yong He	30 March 2021, 19:37:16 UTC	Update documentation.html	30 March 2021, 19:37:16 UTC
f30c6e3	Tim Foley	30 March 2021, 19:31:06 UTC	Organize landing page (#1769) The landing page (`README.md`) has been growing larger and less tidy over time as we try to cram more and more information into it. This change makes a few edits to try to make the landing page shorter and more to the point: * Streamline the opening lines and try to make them focus on the credibility of the system * Break off the list of major features into its own subsection and try to highlight the ones that our current users say they benefit from the most * Move a lot of the information about documentation, examples, Shader Playground, etc. into their own sub-pages to avoid clutter * Break out the list of dependencies in the `License` section to make sure we are being accurate With this change the landing page links to the User's Guide directly, so we probably need to get that rendering nicely ASAP.	30 March 2021, 19:31:06 UTC
9fed1f3	Yong He	30 March 2021, 19:27:10 UTC	Create documentation.html	30 March 2021, 19:27:10 UTC
580040b	Yong He	30 March 2021, 19:24:52 UTC	Update README.md	30 March 2021, 19:24:52 UTC
c9ce5b9	Yong He	30 March 2021, 19:24:39 UTC	Update README.md	30 March 2021, 19:24:39 UTC
ae73d50	Yong He	30 March 2021, 19:11:25 UTC	Update 00-introduction.md	30 March 2021, 19:11:25 UTC
69f956f	Yong He	30 March 2021, 18:47:33 UTC	Update 00-introduction.md	30 March 2021, 18:47:33 UTC
2df2267	Yong He	30 March 2021, 18:43:11 UTC	Update README.md	30 March 2021, 18:43:11 UTC
488e7cd	Tim Foley	30 March 2021, 15:38:33 UTC	Add a streamlined syntax for TEST_INPUT lines (#1768) This change allows the `TEST_INPUT` syntax used by `render-test` to support aggregate values with a single input line more easily. The test writer can now use a syntax like: ``` //TEST_INPUT:set someVar = 3.0 ``` Input lines that start with the `set` keyword will now use a simpler `dst = src` format (instead of `dst:name=src` as the existing syntax used). The right-hand side expression can include: * Numeric literals, both integer and floating-point (currently only supporting 32-bit scalar types; we could fix this later) * Arrays, consisting of zero or more comma-separated expressions inside `[]` * Aggregates, consisting of zero or more comma-separated "fields" inside `{}`. A field can either be `name: <expr>` or just `<expr>` * Objects, which can be written as either `new SomeType{ <fields> }` or `new{ <fields> }` in the case where the type is knowable from context With this approach it should be possible to support almost arbitrary-type inputs on a single line. For now, I have used this support to re-enable an existing test that had been disabled due to lack of support for setting up arrays of objects. Major things left to do: * The new syntax doesn't support the existing cases we had for `Texture2D`, etc. 
Those should probably be supported but I'd like to find a way to do it without duplicating the parsing logic (ideally the value cases from the existing code should Just Work in the new model) * There is no support right now for non-32-bit scalar types * It would be good if this support (and the shader cursor system) supported treating vectors like aggregates * The actual value-setting logic doesn't currently handle aggregates without field names, so `{ a:0, b:1 }` will work but `{ 0, 1 }` will parse but fail when it comes time to set values * While this approach lets complicated values be set with a single line, that isn't always what a user will want to do: in the future we should provide a way to break up an aggregate value over multiple lines that is consistent with this approach * Once we port all of the relevant tests over, it would be great to drop the `set` prefix and have these lines look as simple and conventional as possible	30 March 2021, 15:38:33 UTC
129faf8	Tim Foley	26 March 2021, 17:53:58 UTC	Append proper suffixes to 16-bit literals for GLSL (#1767) * Append proper suffixes to 16-bit literals for GLSL The GLSL output path wasn't putting suffixes on literals of 16-bit types, and that was leading to compilation errors in downstream `glslang`. This change adds the suffixes defined by `GL_EXT_shader_explicit_arithmetic_types`. This change also wraps up 8-bit literals so that they are emitted as, e.g., `int8_t(1)` instead of just `1`, to make sure we don't have implicit conversions in the output GLSL that weren't implicit in the Slang IR. We similarly wrap floating-point special values like infinities in their desired types when the type is `float` (e.g., `double(1.0 / 0.0)` for a double-precision infinity). Note: Standard IEEE 754 half-precision doesn't provide an encoding for infinite or not-a-number values, so it might be considered an error if we emit `half(1.0 / 0.0)` but there really isn't a significantly better alternative for us to emit. * fixup	26 March 2021, 17:53:58 UTC
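The literal-emission rule in the commit above might look roughly like this. It is a sketch, not Slang's emitter: `ScalarType` and `emitIntLiteral` are hypothetical names, and the suffix spellings `s` and `us` are my reading of `GL_EXT_shader_explicit_arithmetic_types` and should be checked against the extension text. Only the `int8_t(1)` form is taken directly from the commit message.

```cpp
#include <string>

// Hypothetical subset of scalar types; not Slang's own enumeration.
enum class ScalarType { Int8, Int16, UInt16, Int32 };

// Emit a GLSL integer literal whose type is explicit in the output text.
std::string emitIntLiteral(ScalarType type, long long value) {
    const std::string digits = std::to_string(value);
    switch (type) {
    case ScalarType::Int8:
        // No 8-bit suffix is assumed here, so wrap in a constructor call.
        return "int8_t(" + digits + ")";
    case ScalarType::Int16:
        return digits + "s";   // assumed 16-bit signed suffix
    case ScalarType::UInt16:
        return digits + "us";  // assumed 16-bit unsigned suffix
    case ScalarType::Int32:
        return digits;         // plain literals are already 32-bit ints
    }
    return digits;  // unreachable for valid enum values
}
```

Making the type explicit at every literal means the downstream compiler never has to pick an implicit conversion that the IR did not contain.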
abb020b	Tim Foley	25 March 2021, 23:40:17 UTC	Clean up render-test handling of input (#1766) The original goal of this change was to streamline the `TEST_INPUT` system by eliminating options that are no longer relevant once we have eliminated the non-shader-object execution paths. The result is more or less a re-implementation/refactor of the logic around how input is parsed and represented, that tries to set things up for a more general system going forward. The main change is that the `ShaderInputLayout` no longer tracks a simple flat list of `ShaderInputLayoutEntry` (that is a kind of pseudo-union of the various buffer/texture/value cases), and it instead uses a hierarchical representation composed of `RefObject`-derived classes to represent "values." There are several "simple" cases of values * Textures * Samplers * Uniform/ordinary data (`uniform`) * Buffers composed of uniform/ordinary data (`ubuffer`) Then there are composed/aggregate values that nest other values: * An aggregate value is a set of fields which are name/value pairs. It can be used to fill in a structure, for example. * An array value is a list of values for the elements of an array. It can be used to fill out an array-of-textures parameter, for example. * A combined texture/sampler value is a pair of a texture value and a sampler value (easy enough) * An object holds an optional type name for a shader object to allocate (it defaults to the type that is "under" the current shader cursor when binding), and a nested value that describes how to fill in the contents of that object Finally there are cases of values that are just syntactic sugar: * A `cbuffer` is just shorthand for creating an object value with a nested uniform/ordinary data value The big idea with this recursive structure is that it gives us a way to handle more arbitrary data types with name-based binding. 
Supporting this new capability requires changes to both how input layouts get parsed, and also how they get bound into shader objects. On the parsing side, things have been refactored a bit so that parsing isn't a single monolithic routine. The refactor also tries to make it so that the various options on an input item (e.g., the `size=...` option for textures) are only supported on the relevant type of entry (so you can't specify as many useless options that will be ignored). The bigger change to parsing is that it now supports a hierarchical structure, where certain input elements like `begin_array` can push a new "parent" value onto a stack, and subsequent `TEST_INPUT` lines will be parsed as children of that item until a matching `end` item. This approach means that we can now in principle describe arbitrary hierarchical structures as part of test input without endlessly increasing the complexity of individual `TEST_INPUT` lines. On the binding side, we now have a central recursive operation called `assign(ShaderCursor, ShaderInputLayout::ValPtr)` that assigns from a parsed `ShaderInputLayout` value to a particular cursor. That operation can then recurse on the fields/elements/contents of whatever the cursor points to. Major open directions: * With this change it is still necessary to use `uniform` entries to set things like individual integers or `float`s and that is a little silly. It would be good to have some streamlined cases for setting individual scalar values. * Further, once we have a hierarchical representation of the values for `TEST_INPUT` lines, it becomes clear that we really ought to move to a format more like `TEST_INPUT: dstLocation = srcValue;` where `srcValue` is some kind of hierarchical expression grammar. Refactoring things in this way should make the binding logic even more clear and easy to understand. 
The refactored parser should make parsing hierarchical expressions easier to do in the future (even if it uses the push/pop model for now) * One detailed note is that the representation of buffers in this change is kind of a compromise. Just as an "object" value is a thin wrapper around a recursively-contained value for its "content" it seems clear that a buffer could be represented as a wrapper around a content value that could include hierarchical aggregates/objects instead of just flat binary data (this would be important for things like a buffer over a structure type that lays out differently on different targets). The main problem right now with changing the representation is actually needing to compute the size of a buffer based on its content, so that can/should be addressed in a subsequent change. Details: * The base `RenderTestApp` class and the `ShaderObjectRenderTestApp` classes have been merged, since the hierarchy no longer serves any purpose. * Disabled the tests that rely on `StructuredBuffer<IWhatever>` because they aren't really supported by our current shader object implementation * Replaced uses of `Uniform` and `root_constants` in `TEST_INPUT` lines with just `uniform` * Removed a bunch of uses of `stride` from `cbuffer` inputs, where it wasn't really correct/meaningful * Added the `copyBuffer()` operation to VK/D3D renderers, along with some missing `Usage` cases to support it. * Made `ShaderCursor` handle the logic to look up a name in the entry points of a root shader object, rather than just having that logic in `render-test`. (We probably need to make a clear design choice on this issue)	25 March 2021, 23:40:17 UTC
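The recursive `assign` operation described in the commit above can be illustrated with a toy model. Everything here is a hypothetical stand-in: `Val`, `DataVal`, `AggregateVal`, `ArrayVal`, and `Cursor` are simplified substitutes for the `render-test` value classes and `ShaderCursor`, and the "binding" is just a write into a map keyed by path.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; not the real render-test classes.
struct Val {
    virtual ~Val() = default;
};
using ValPtr = std::shared_ptr<Val>;

struct DataVal : Val {        // a "simple" leaf value
    float data;
    explicit DataVal(float d) : data(d) {}
};
struct AggregateVal : Val {   // name/value pairs, e.g. for a struct
    std::vector<std::pair<std::string, ValPtr>> fields;
};
struct ArrayVal : Val {       // one value per array element
    std::vector<ValPtr> elements;
};

// A toy cursor: a path into a flat map of bound leaf values.
struct Cursor {
    std::map<std::string, float>* bound;
    std::string path;
    Cursor getField(const std::string& name) const {
        return {bound, path.empty() ? name : path + "." + name};
    }
    Cursor getElement(std::size_t index) const {
        return {bound, path + "[" + std::to_string(index) + "]"};
    }
};

// Assign a parsed value to a cursor, recursing into fields and elements.
void assign(const Cursor& cursor, const ValPtr& val) {
    if (auto data = dynamic_cast<const DataVal*>(val.get())) {
        (*cursor.bound)[cursor.path] = data->data;
    } else if (auto agg = dynamic_cast<const AggregateVal*>(val.get())) {
        for (const auto& field : agg->fields)
            assign(cursor.getField(field.first), field.second);
    } else if (auto arr = dynamic_cast<const ArrayVal*>(val.get())) {
        for (std::size_t i = 0; i < arr->elements.size(); ++i)
            assign(cursor.getElement(i), arr->elements[i]);
    }
}
```

Because each case only narrows the cursor and recurses, adding a new kind of composite value means adding one branch rather than another flat option on every input line.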
e050035	Yong He	25 March 2021, 16:41:53 UTC	Improve Vulkan shader-objects implementation. (#1765) * Improve Vulkan shader-objects implementation. 1. Null bindings no longer crash. 2. No longer copies push constants to staging CPU buffer before setting it into command buffer. The entry-point shader object now directly sets it into command buffer upon `bindObject` call. * Update comments * Fix * Re-enable 3 tests. Improved vulkan implementation so that each shader object is responsible for creating descriptor sets on-demand. Fixed slang reflection to correctly report `ParameterBlock` binding. * Fix gcc compile error.	25 March 2021, 16:41:53 UTC
98afb42	Yong He	24 March 2021, 20:57:55 UTC	Reimplement Vulkan shader objects. (#1764) * Reimplement Vulkan shader objects. This change reimplements Vulkan shader objects in the `gfx` layer so that it is no longer layered on top of the `DescriptorSet` abstraction. Since this is the last implementation that uses `DescriptorSet`, the change also removes all `DescriptorSet` related API from public `gfx` interface. The Vulkan implementation now passes all test cases, but it still has two issues: 1. The PushConstant setting is not correct, this is because we don't seem to be able to get correct reflection data about the size of push constants for an entry-point. 2. The `shader-toy` example can't run on Vulkan, because it currently sets nullptr to `Texture` bindings, and this change doesn't properly handle setting resource to null in `ShaderObject`s yet. If we can use the `nullDescriptor` feature on vulkan, this implementation will be simple. However we still want to decide whether we want to use a Vulkan 1.2 feature for this. * Fix up	24 March 2021, 20:57:55 UTC
d0f7b7f	Yong He	22 March 2021, 23:33:51 UTC	`gfx` D3D12 shader objects rewrite. (#1763)	22 March 2021, 23:33:51 UTC
0f9b3a9	Yong He	18 March 2021, 20:19:58 UTC	Remove `DescriptorSet` from D3D11 and GL devices. (#1761)	18 March 2021, 20:19:58 UTC
6e5d85e	Tim Foley	17 March 2021, 19:55:30 UTC	Remove old code paths from render-test (#1760) * Remove old code paths from render-test Historically, the `render-test` tool was using three different code paths: * One based on `gfx` and manual (non-reflection-based) parameter setting, used for OpenGL, D3D11, D3D12, and Vulkan * One for CPU that used reflection-based parameter setting but shared no code with the first * One for CUDA that used reflection-based parameter setting and shared some, but not all, code with the CPU path Recently we've updated `render-test` to include a fourth option: * Using `gfx` and the "shader object" system it exposes for a unified reflection-based parameter-setting system that works across OpenGL, D3D11, D3D12, Vulkan, CUDA, and CPU This change removes the first three options and leaves only the single unified path. As a result, a bunch of code in `render-test` is no longer needed, and the codebase no longer relies on things like the `IDescriptorSet`-related APIs in `gfx`. Several existing tests had to be disabled to make this change possible. Those tests will need to be audited and either re-enabled once we fix issues in the shader object system, or permanently removed if they don't test stuff we intend to support in the long run (e.g., global-scope type parameters, which aren't a clear necessity). * fixup: CUDA detection logic	17 March 2021, 19:55:30 UTC
b64a23c	Tim Foley	16 March 2021, 22:27:34 UTC	Fix the "acceleration structure in compute" bug for GL_NV_ray_tracing too (#1759) A recent change broke code that uses `RayTracingAccelerationStructure` in non-RT shader stages for Vulkan/GLSL when also not doing any ray tracing in the shader code. A recent fix patched that up for code using `GL_EXT_ray_tracing` and/or `GL_EXT_ray_query`, but that fix didn't apply on the path that uses `GL_NV_ray_tracing` via an opt-in. This change fixes that gap and checks in a test for it.	16 March 2021, 22:27:34 UTC
210a988	Tim Foley	16 March 2021, 20:30:39 UTC	Update binaries (#1758)	16 March 2021, 20:30:39 UTC
6a360f7	Tim Foley	16 March 2021, 19:12:37 UTC	Enable building glslang from source (#1757) * Enable building glslang from source Somehow the slang-glslang binaries we are currently using aren't the most up-to-date ones, so I am enabling building glslang from source so that we can produce new binaries. * fixup: run generators	16 March 2021, 19:12:37 UTC
10b39e0	Yong He	15 March 2021, 19:59:58 UTC	Enable `gfx::CUDADevice` on linux. (#1756)	15 March 2021, 19:59:58 UTC
e428f6e	jsmall-nvidia	15 March 2021, 17:54:52 UTC	Preliminary docs on 'Doc System'. (#1755) * #include an absolute path didn't work - because paths were taken to always be relative. * First docs on 'doc system'. * Small improvements to doc system documentation. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	15 March 2021, 17:54:52 UTC
b6de9a0	jsmall-nvidia	15 March 2021, 16:48:20 UTC	Test Doc System (#1754) * #include an absolute path didn't work - because paths were taken to always be relative. * Use capability system in docs. Simplify how requirements/availability is produced. * Small fixes in output of availability. * Updated stdlib doc. * Small improvements. * Added doc test type. Improved readability of straight .md text Made -doc option output to diagnostic stream. * Add test for checking requirements info is correctly extracted. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	15 March 2021, 16:48:20 UTC
d8150e7	Tim Foley	15 March 2021, 16:27:48 UTC	Fix handling of RT acceleration structures for non-RT stages (#1753) * Fix handling of RT acceleration structures for non-RT stages The recent change that added support for the `GL_EXT_ray_query` extension made it so that a shader that declares a `RaytracingAccelerationStructure` as an input to a non-RT shader stage but then never uses it wouldn't enable any RT extension, resulting in a compilation failure in glslang. This change reverts that behavior so that such shaders enable `GL_EXT_ray_tracing`, since that is the older of the two RT extensions that introduce `accelerationStructureEXT`. It is possible that we will need to revisit this decision based on which of the two extensions ends up being more broadly supported, but I think that right now it is fair to say that there exist drivers that support `GL_EXT_ray_tracing` but not `GL_EXT_ray_query`, so the former is the better default. * fixup: failing test	15 March 2021, 16:27:48 UTC
fd304c6	jsmall-nvidia	15 March 2021, 15:16:32 UTC	Improvements in Docs requirements/availability (#1751) * #include an absolute path didn't work - because paths were taken to always be relative. * Use capability system in docs. Simplify how requirements/availability is produced. * Small fixes in output of availability. * Updated stdlib doc. * Small improvements. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	15 March 2021, 15:16:32 UTC
3d10d13	Yong He	12 March 2021, 21:13:49 UTC	Cleanup CPU renderer. (#1752)	12 March 2021, 21:13:49 UTC
d6a37a0	Tim Foley	12 March 2021, 19:58:14 UTC	Add a CPU renderer implementation (#1750) * Add a CPU renderer implementation This change adds a CPU back-end to `gfx` and ensures that most of our existing CPU tests pass when using it. Detailed notes: * Most of the CPU renderer implementation is copy-pasted from the CUDA case, so they share a lot of similar logic * The main addition to the CPU renderer is a semi-complete implementation of host-memory textures. The logic here handles all the main shapes (Buffer, 1D, 2D, 3D, Cube) and all the currently-supported `Format`s that are sample-able as-is (no D24S8). The implementation is not intended to be fast, and it currently only does nearest-neighbor sampling, but otherwise it tries to avoid cutting too many corners and should be a reasonable starting point for a more complete (but not performance-oriented) implementation. * Refactored the CPU prelude `IRWTexture` interface to inherit from `ITexture`, since in most cases a single type will end up implementing both. It might be worth it to collapse it all down to a single interface later. * Changed the CPU prelude `ITexture`/`IRWTexture` interface so that it takes both a pointer and a size for output arguments. This change seems necessary to allow a shader variable declared as a `Texture2D<float>` to fetch a single `float` when the underlying texture might be using RGBA32F. * Added to the `IComponentType` public API so that we can query a "host callable" for an entry point and not just a binary. * Turned off the `-shaderobj` flag on two tests that weren't yet compatible with shader objects but still had the flag left in on the path (since previously the CPU path always used the non-`gfx` non-shader-object logic anyway) * Disabled one test (`dynamic-dispatch-11`) that relied on the `ConstantBuffer<IInterface>` idiom that we know we are planning to change soon anyway. * Made a few changes to the CUDA path to bring it into line with what I added for the CPU path. 
These were mostly bug fixes around indexing logic for sub-objects and resources. * fixup	12 March 2021, 19:58:14 UTC
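Nearest-neighbor sampling of a host-memory texture, as mentioned in the commit above, can be sketched as follows. This is an illustration only: the `Texture2D` struct and both function names are hypothetical, the texture is single-channel for brevity, and clamp-to-edge addressing is an assumption (the commit does not say which addressing modes the CPU back-end supports).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical single-channel host texture; not the gfx CPU texture type.
struct Texture2D {
    int width;
    int height;
    std::vector<float> texels;  // row-major, width * height entries
};

// Map a normalized coordinate to the index of the texel that contains it,
// clamping out-of-range coordinates to the nearest edge texel.
int nearestTexel(float coord, int size) {
    int index = static_cast<int>(std::floor(coord * static_cast<float>(size)));
    return std::clamp(index, 0, size - 1);
}

// Fetch the texel nearest to normalized coordinates (u, v).
float sampleNearest(const Texture2D& tex, float u, float v) {
    assert(tex.width > 0 && tex.height > 0);
    int x = nearestTexel(u, tex.width);
    int y = nearestTexel(v, tex.height);
    return tex.texels[static_cast<std::size_t>(y) * tex.width + x];
}
```

For a 2x2 texture holding `{0, 1, 2, 3}`, the coordinates `(0.75, 0.25)` select column 1 of row 0, so the sample is `1`.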
9ffe2f3	jsmall-nvidia	11 March 2021, 22:56:03 UTC	MarkDown -> Markdown (#1748) * #include an absolute path didn't work - because paths were taken to always be relative. * MarkDown -> Markdown slang-doc-mark-down -> slang-doc-markdown-writer	11 March 2021, 22:56:03 UTC
5bcb342	jsmall-nvidia	11 March 2021, 22:08:08 UTC	stdlib documentation (#1745) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out AST 'printing'. * Replace listener with List<Section> * Section -> Part. * Kind -> Type Flags -> Kind for ASTPrinter::Part * Improve comments around ASTPrinter. * toString -> toText on Val derived types. toText appends to a StringBuilder. * Added toSlice free function. Added operator<< for Val derived types. Use << where appropriate in doing toText. * More work at mark down output. * Fill in sourceloc for enum case. Add more sophisticated location determination for EnumCase. Refactored documentation output into DocMarkdownWriter. * Improvements for sig output. * Split up slang-doc into extractor and writer. * WIP generic support for doc support. * Some refactoring to make DocExtractor have potential to be used without Decls. * Made doc extraction work without Decls. * Output generic parameters. * Add generic parameter extraction. * Added writing variables. * Add an interface test. * Fix toArray. * Support for extensions, and inheritance. * Disable the doc test. * Added flags to compileStdLib. * More work around handling generics in markdown output. * More improvements around associated type handling. * List method names only once. Output in/out/inout/const * Fix namespace printing. * WIP summarizing doc output. * Small fixes and improvements for doc output. * Output all stdlib in single doc file. * Remove compile flags from addBuiltinSource. * Find only unique signatures. First pass at trying to get requirements. * First pass at requirements for stdlib docs. * Remove __ function/methods * Added Target Availability * Add markup access. Make sections of stdlib hidden. * MarkdownAccess -> Visibility Add isVisible methods Use ASTPrinter to print decl name. * Add current stdlib doc output. * Disable doc test for now. * Fix clang issue. * Don't use bullets and numbering , just use numbering. 
* Put methods in source order. * Fix bad-operator-call.slang test that fails because it now outputs out parameters as such. * Refactor MarkDownWriter to separate 'extraction' from output. * Fix typo around @ lines. * Fix issue with extracting 'before' when preceded by complex attributes/modifiers. * Fix handling of generics with the same name. * Work around for having overloading with generics - we don't want to output generic params as part of name. * Remove generic parameters from name. * Simplify handling of outputting overridable names.	11 March 2021, 22:08:08 UTC
4b74f99	Tim Foley	11 March 2021, 21:08:21 UTC	Change representation of initial data for textures (#1747) * Change representation of initial data for textures Before this change, initial data for a texture has been provided with the `ITextureResource::Data` type, where a call to `IDevice::createTexture()` would take zero or one `Data` and, if present, use it to initialize all the subresources of a texture. The organization of `Data` was not actually quite how its own documentation comment described it (the implementations didn't agree with the comment), and while it aggressively factored out redundancies (e.g., only storing the stride for each mip level once, instead of once per subresource for large arrays), the result was that setting up a `Data` correctly was a bit confusing. This change makes the initial data for a texture using a `SubresourceData` type that is almost identical to what D3D11 uses, so that developers are more likely to be comfortable filling it in. All of the existing implementations were easily adapted to use the new type, so it seems like a net win. Note: Both Vulkan and D3D11 do away with the idea of initializing a texture with data as part of allocating it, and we might eventually want to do the same given the complexity that this system entails. The main reason to preserve this detail is for better compatibility with D3D11, where immutable textures/buffers need to have their data specified at creation time. It seems good to preserve the ability to have immutable resources on target APIs where this distinction could affect performance (e.g., immutable resources do not need state/transition tracking on APIs like D3D11). * fixup: CUDA	11 March 2021, 21:08:21 UTC
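A per-subresource description in the D3D11 style, as the commit above describes, might be filled in like this. The sketch makes several assumptions: the struct below is a hypothetical stand-in rather than the actual `gfx` declaration, the source buffer is tightly packed with each array layer's mip chain stored contiguously, and entries are ordered as `mip + layer * mipLevels`, which is the ordering D3D11 uses for subresource indices.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in modeled on D3D11's per-subresource description.
struct SubresourceData {
    const void* data;     // first byte of this subresource
    std::size_t strideY;  // bytes between consecutive rows
    std::size_t strideZ;  // bytes between consecutive depth slices
};

// Describe every (layer, mip) subresource of a tightly packed 2D texture
// array, one entry each, ordered as mip + layer * mipLevels.
std::vector<SubresourceData> describeSubresources(
    const std::uint8_t* base, int width, int height,
    int mipLevels, int arraySize, std::size_t bytesPerTexel)
{
    std::vector<SubresourceData> result;
    std::size_t offset = 0;
    for (int layer = 0; layer < arraySize; ++layer) {
        for (int mip = 0; mip < mipLevels; ++mip) {
            // Each mip halves the extent, never dropping below one texel.
            std::size_t w = static_cast<std::size_t>(std::max(1, width >> mip));
            std::size_t h = static_cast<std::size_t>(std::max(1, height >> mip));
            std::size_t rowBytes = w * bytesPerTexel;
            result.push_back({base + offset, rowBytes, rowBytes * h});
            offset += rowBytes * h;
        }
    }
    return result;
}
```

Storing the strides once per subresource is redundant for large arrays, but it removes the need to reason about which values are shared across layers.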
a07455c	Yong He	11 March 2021, 17:14:30 UTC	Add Linux support to `platform` and `gfx`. (#1744)	11 March 2021, 17:14:30 UTC
6cbd9d6	Tim Foley	10 March 2021, 23:18:06 UTC	A bunch of overlapping semantic-checking fixes (#1743) This change originally started with the simple goal of allowing generic functions with default argument values on their parameters to work: ``` void someFunction<T>(T value, int optional = 0); ``` The core problem there was that the compiler code was (correctly) anticipating the case where the default argument value for a parameter depends on a generic parameter, such as: ``` interface IDefaultable { static This getDefault(); } void anotherFunction<T : IDefaultable>(T first, T second = T.getDefault()); ``` Supporting this latter case requires some kind of ability to apply substitutions to an `Expr`, but our compiler logic simply errored out in that case. The first major fix that went into this change was to add a new `SubstExpr<T>` type that behaves a lot like `DeclRef<T>` in that it stores a `T` plus a set of substitutions that need to be applied to it. In addition, it was found that even if `anotherFunction<ConcreteType>(...)` might work, a call that relied on generic argument inference, as in just `anotherFunction(...)`, would fail because inference includes a strict match on the number of arguments/parameters in the call expression. The next problem that arose was that the test I'd created used an interface with an `__init` requirement, and it appeared that our code generation didn't work for that case: ``` interface IStuff { __init(int val); } void f<T : IStuff>(T x = T(0)); ``` In this case, the `T(0)` initialization would get compiled to `(ConcreteType) 0` in the output rather than calling the function generated for the `__init` inside `ConcreteType`. The basic problem there was a bit of crufty old logic we have in place to work around the large number of `__init` declarations in the stdlib that don't have proper `__intrinsic_op` modifiers on them. 
We really need to fix the underlying problem there, but I worked around it by having the IR lowering pass only do its workaround magic on stdlib declarations. The next problem down this line was that my test had two different `__init` declarations in the concrete type and the logic for checking interface conformance was picking the wrong one to satisfy an interface requirement despite it being obviously wrong (not even the right number of parameters). This last problem led me down the rabbit-hole of trying to actually get our semantic checking for interface requirements right. There were a few pieces to that work: * Actually checking that the parameter and result types for two callables match is the simple part. If that was all that would be required we would have implemented this logic a long time ago. * Next we have to deal with functions that make use of the `This` type, associated types, etc. We have to know that when the interface uses `This`, we want to treat that as equivalent to `ConcreteType`, and similarly for associated types. Getting that working is mostly a matter of setting up a this-type substitution for the interface member being checked. * Finally, when comparing generic declarations like `IBase::doThing<T>` and `Derived::doThing<U>` we need to deal with the way that `T` and `U` represent the "same" logical type parameter, but are distinct `Decl`s. This is handled by specializing the base declaration to the parameters of the derived one (e.g., forming `IBase::doThing<U>` using the `U` from `Derived::doThing`). The result seems to be passing our tests, but there are still a few gotchas lurking, I'm sure.	10 March 2021, 23:18:06 UTC
6ef4054	Yong He	10 March 2021, 18:58:15 UTC	Swapchain resize and rename to `IDevice` (#1741) * Swapchain resize * Fix.	10 March 2021, 18:58:15 UTC
2765861	Tim Foley	08 March 2021, 21:05:56 UTC	Add GLSL support for SV_InnerCoverage (#1740) This was a fairly straightforward addition once I found the correct GLSL extension spec to use.	08 March 2021, 21:05:56 UTC
fc9968d	Yong He	08 March 2021, 18:01:20 UTC	Refactor window library. (#1739) * Refactor window library. * Fix project file * Fix warnings.	08 March 2021, 18:01:20 UTC
95ca939	Yong He	08 March 2021, 03:31:08 UTC	Bug fix in window creation. (#1738)	08 March 2021, 03:31:08 UTC
e962f1a	Tim Foley	05 March 2021, 23:02:44 UTC	Add Vulkan/SPIR-V support for TraceRayInline() (#1737) For the most part, this translation is straightforward because the `GL_EXT_ray_query` extension is well aligned with the DXR 1.1 `RayQuery` feature. Many functions map one-to-one from one extension to the other. A few notable details: * The equivalent of the `RayQuery<Flags>` type is non-generic in GLSL, and the GLSL path previously didn't have support for trying to look up an intrinsic type name on an IR type declaration, so that required some tweaks to the emit logic. * All the GLSL functions are free functions instead of member functions, but our IR doesn't recognize that distinction anyway. * The main `TraceRayInline()` call is the one that took the most tweaking, just because it takes a `RayDesc` structure for D3D/HLSL but takes individual vectors and scalars for VK/GLSL. The approach here is a standard one for how we manage this stuff in the stdlib (and I wanted to avoid adding even more `$` magic for intrinsics). * For several other calls, the HLSL API had distinct `Candidate*()` and `Committed*()` calls that return information about a candidate hit vs. the one committed into the query. In contrast, the GLSL API uses a single call that takes an additional "must be compile-time constant" `bool` parameter to select between the two behaviors. This is even the case for one call that basically returns a value of a different `enum` type depending on the state of that `bool`. The D3D API model here seems almost strictly better and I have no idea why the GLSL extension was defined this way. Because both the `GL_EXT_ray_query` and `GL_EXT_ray_tracing` extensions declare the `accelerationStructureEXT` type, we can no longer infer what extension is supposed to be used based only on the presence of such a type. 
The logic right now is a bit slippery, because in theory a program that declares an acceleration structure but never traces into it could end up getting a compilation error now. We will have to see if that corner case comes up in practice. :( The one big detail that is looming after doing this work is that both the HLSL and GLSL exposures of ray queries are extremely "slippery" about the actual identity of queries (e.g., when is one query a copy of another, vs. just being a new variable that references the existing query). Somehow queries get their identity from the original declaration, and as such our "default constructor" approach to them seems semantically correct, but the whole thing is kind of slippery at a foundational level and I don't know how to fix it with the API as defined. Oh well; just something to keep an eye on. Co-authored-by: Yong He <yonghe@outlook.com>	05 March 2021, 23:02:44 UTC
860d17b	jsmall-nvidia	05 March 2021, 19:34:46 UTC	Doc tooling improvements (#1734) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out AST 'printing'. * Replace listener with List<Section> * Section -> Part. * Kind -> Type Flags -> Kind for ASTPrinter::Part * Improve comments around ASTPrinter. * toString -> toText on Val derived types. toText appends to a StringBuilder. * Added toSlice free function. Added operator<< for Val derived types. Use << where appropriate in doing toText. * More work at mark down output. * Fill in sourceloc for enum case. Add more sophisticated location determination for EnumCase. Refactored documentation output into DocMarkdownWriter. * Improvements for sig output. * Split up slang-doc into extractor and writer. * WIP generic support for doc support. * Some refactoring to make DocExtractor have potential to be used without Decls. * Made doc extraction work without Decls. * Output generic parameters. * Add generic parameter extraction. * Added writing variables. * Add an interface test. * Fix toArray. * Support for extensions, and inheritance. * Disable the doc test. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	05 March 2021, 19:34:46 UTC
dc71108	Yong He	05 March 2021, 18:58:08 UTC	Cache stdlib when creating global session. (#1736) * Cache stdlib when creating global session. * Fix * Fix	05 March 2021, 18:58:08 UTC
a5ac499	Yong He	05 March 2021, 00:25:58 UTC	Refactor `gfx` to surface `CommandBuffer` interface. (#1735) * Refactor `gfx` to surface `CommandBuffer` interface. * Fixes. * Fix code review issues, and make vulkan runnable on devices without VK_EXT_extended_dynamic_states. * Update solution files * Move out-of-date examples to examples/experimental Co-authored-by: Yong He <yhe@nvidia.com>	05 March 2021, 00:25:58 UTC
13ff0bd	Tim Foley	03 March 2021, 19:45:39 UTC	Add GLSL/SPIR-V support for GetAttributeAtVertex (#1733) This change allows varying fragment shader inputs to be declared in a way that allows the `GetAttributeAtVertex` operation to compile to valid code for both D3D and GLSL/SPIR-V/Vulkan. The key is that rather than just use ordinary `nointerpolation`-qualified inputs the code must declare these varying inputs with a new `pervertex` qualifier that marks them as only being usable with `GetAttributeAtVertex`. The `pervertex`-tagged inputs then translate to GLSL inputs using the `pervertexNV` qualifier. Note that this change does not include any enforcement of the requirements around how these qualifiers are used (and the compiler doesn't have enforcement for the existing operations like `EvaluateAttributeAtCentroid`). The underlying problem is that the interpolation-mode qualifiers and explicit interpolation functions in HLSL constitute a kind of rate-qualified type system, but without any systematic rules. It seems wasteful to encode a bunch of ad hoc rules for this stuff as special cases in the compiler when the clear right answer is to implement a systematic approach to rates.	03 March 2021, 19:45:39 UTC
d6ae671	Tim Foley	02 March 2021, 23:46:28 UTC	Clean up declarator handling during source emit (#1732) This change tidies up some code related to the handling of declarators for the purpose of "unparsing" types into C-like declarations. The big change is that the `EDeclarator` type is changed to `DeclaratorInfo` and now has a bit of a subtype hierarchy under it rather than just using a `union`. The declarations have been moved to the header for `CLikeSourceEmitter` so that they can be used by subclasses. I also removed the `IRDeclaratorInfo` type that was being declared but never actually used, and moved the case for pointers from that type into the main `EDeclarator`/`DeclaratorInfo`.	02 March 2021, 23:46:28 UTC
c2653ba	jsmall-nvidia	02 March 2021, 22:03:16 UTC	Fix issue with long identifier names in GLSL output (#1731) * #include an absolute path didn't work - because paths were taken to always be relative. * First pass at handling 'names' that are too long in GLSL output. * Test to check functionality with very long func name. * Add access a long names buffer. * Fix typo in assert. Fix issue with coercion error for 1.0f / 0x7fffffff Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	02 March 2021, 22:03:16 UTC
b81e8d4	Tim Foley	02 March 2021, 20:52:34 UTC	Add command-line control over SPIR-V version (#1730) * Add command-line control over SPIR-V version By default the Slang compiler policy is usually to produce output with the fewest dependencies possible. If input code can be encoded as SPIR-V 1.0, that is what we will use by default. The catch here is that in some cases later SPIR-V versions introduced improvements to the encoding that can affect performance (e.g., around large global arrays of constants), so that a user might explicitly want to require a newer SPIR-V version (restricting the driver versions their code can work on) in the hopes of seeing better performance. This change uses the system of capabilities that was previously introduced so that an option like `-profile glsl_450+spirv_1_5` can be used to explicitly request a specific SPIR-V version. Consistent with the existing implementation, the requested version will be taken as a minimum, and the final version might be higher based on other requirements (e.g., use of intrinsic functions that require a higher version). The test case included here is a little iffy in terms of long-term maintenance. It relies on having both a `.slang` file and a `.glsl` file that we compile with the same options and then compare the SPIR-V, but that means there is no direct testing that the output SPIR-V actually uses the necessary version. If we break the inference of SPIR-V versions for both the regular and pass-through paths at once, this test won't flag the problem. A better test is probably needed soon. This change only adds support for controlling the SPIR-V version via capabilities specified via the command line or API. It would be nice for a future change to allow something like `[require(spirv_1_5)]` to be added to an entry point function to allow the user to embed their expectation/requirement into the source code. * fixup: clang warning	02 March 2021, 20:52:34 UTC
837a155	jsmall-nvidia	01 March 2021, 20:37:46 UTC	Doc improvements (#1729) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out AST 'printing'. * Replace listener with List<Section> * Section -> Part. * Kind -> Type Flags -> Kind for ASTPrinter::Part * Improve comments around ASTPrinter. * toString -> toText on Val derived types. toText appends to a StringBuilder. * Added toSlice free function. Added operator<< for Val derived types. Use << where appropriate in doing toText. * More work at mark down output. * Fill in sourceloc for enum case. Add more sophisticated location determination for EnumCase. Refactored documentation output into DocMarkdownWriter. * Improvements for sig output.	01 March 2021, 20:37:46 UTC
b3501ad	Tim Foley	26 February 2021, 17:43:03 UTC	Shader object specialization work-in-progress (#1728) * Shader object specialization work-in-progress The big change here is in the `setObject()` implementations, where we now write the witness table ID and data for the value being assigned in both the CUDA and graphics-API paths (it is possible the code could be shared...). The logic for deciding whether a value "fits" in the existential value payload should actually be correct here, since it uses the reflection data. The other relevant change is that the logic for writing out the ordinary/uniform data for a shader object on the graphics-API path has been updated so that it only allocates the GPU buffer after it knows the specialized layout, and can thus allocate space for any extra parameter data that wasn't in the original layout but got added by specialization. There is some inactive code in place that tries to sketch how the implementation should handle writing the data of sub-objects for interface-type fields into the appropriate areas of the allocated buffer for a parent object, but that is stubbed out for now pending implementation of the relevant reflection information. This change also introduces logic in the graphics-API path to create a specialized layout for a shader object on-demand (so that it will only be created after the specialization arguments are known or can be inferred). The implementation needs to treat ordinary shader objects and root shader objects differently because the Slang API handles specialization differently for ordinary types vs. `IComponentType`s. Some notes and caveats: * The CUDA path doesn't need to compute specialized layouts the way the graphics-API path does because layout doesn't change based on specialization for that path (just as it won't for the CPU path). * This code just skips over the RTTI field in existential values because it seems that we currently aren't using it in generated code. 
* We are completely missing the logic for recursively writing the resource ranges of sub-objects bound to interface-type fields into the descriptor set(s) of the parent object. The missing link there is reflection API support, just as it is for filling in the ordinary/uniform data. We need a way to get the binding range offset (and binding array stride) for the "pending" data of a specialized interface-type field. * The logic for computing specialization arguments based on the shader objects bound to interface-type fields has a lot of holes. Some of the indexing math is flat-out incorrect, and it also doesn't make any attempt to handle sub-object ranges with more than one element in them. I tweaked some of the code there to make it more correct, but that doesn't mean it is actually correct at this point. * The logic for computing a specialized `IComponentType` for a `ProgramVars` in the graphics-API path seems to have a lot of overlap with `maybeSpecializeProgram()`, so we should look into ways to avoid the duplication over time. * clang error fix	26 February 2021, 17:43:03 UTC
af63ee4	Tim Foley	25 February 2021, 03:22:31 UTC	Partial fix for macro expansion of token pastes (#1727) The underlying problem here requires that we have an object-like macro with an expansion that starts with a non-identifier token: ``` ``` Then we need a function-like macro that uses a token paste in a way that can expand to that object-like macro: ``` ``` Finally, for the specific case a user ran into, we need to invoke that function-like macro in the context of a preprocessor conditional expression: ``` // ... #error "unimplemented" ``` The way the problem manifests is that the preprocessor logic that handles conditional expressions tries to "peek" one token ahead and see what is coming, and while the peeking logic handles macro expansion it does not handle token pasting right now. That means that the peek operation sees `MY_FEATURE` and assumes that it is seeing an identifier in a preprocessor conditional that doesn't have a macro expansion. The logic then goes on to read the token, but what it gets back is not an identifier, and is instead the numeric literal token `1`, because the reading logic handles token pasting. The quick fix I applied here is to make the logic that deals with preprocessor conditionals go ahead and automatically consume a token from the input, and then decide what to do based on that token, so that it always makes use of the reading logic that handles token pasting. The lingering problem is that we still have cases in the preprocessor that use the peeking logic which doesn't handle pasting, and we might find that those cases have reason to want the same kind of expansion behavior we needed here. A more systematic fix would be to have the peeking logic automatically handle token pasting as well as macro expansion, but doing so would be a more complicated change because detecting the `##` when peeking ahead requires two tokens of lookahead, and our current implementation only assumes we can support one. 
Co-authored-by: Yong He <yonghe@outlook.com>	25 February 2021, 03:22:31 UTC
9b7a007	Yong He	24 February 2021, 23:43:43 UTC	Explicit swapchain interface in `gfx`. (#1726) * Explicit swapchain interface in `gfx`. * Correctly return nullptr when `IRenderer` creation failed. * Fix crashes on CUDA tests. * Cleanups.	24 February 2021, 23:43:43 UTC
d66b307	Tim Foley	24 February 2021, 16:21:37 UTC	Add support for GetAttributeAtVertex for D3D (#1725) This operation was added along with the `SV_Barycentrics` system-value input, and allows for a `nointerpolation` varying input to a fragment shader to be fetched at a specific vertex index within the primitive that is causing the fragment shader to be invoked. This change adds support for the new operations in the standard library, and also includes a test case to make sure that we emit it correctly when producing HLSL/DXIL. This change also includes a small bug fix to our emission logic for function parameters so that we properly emit layout-related attributes for varying parameters declared directly on an entry point. (Note that most attributes end up being declared in `struct` types in existing HLSL shaders, and our IR passes produce only global variables for attributes on GLSL; the only case this affects is individual scalar/vector attributes declared as entry-point parameters, when outputting HLSL.) Note that this change only adds support for the new function on the HLSL/DXIL path, and doesn't yet add any cross-compilation support for GLSL/SPIR-V. The reason for this is that the equivalent GLSL feature(s) appear to use a different model to the HLSL version, and we need to invent a suitable approach to align them to make portable code possible.	24 February 2021, 16:21:37 UTC
55a5ccc	jsmall-nvidia	23 February 2021, 17:36:46 UTC	Documentation markup extraction (#1724) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP extracting source documentation. * WIP doc extraction. * More stuff around doc markup extraction. * More WIP around doc extraction. * Fix some indexing issues. * Initial doc extraction working. * Renaming of types in markup extraction process. * Extracting markup content. Removing indenting. Other fixes and improvements around document tools. * WIP support for documentation system. * Remove some commented out sections. * Remove some comments that no longer apply. * Improvements around SourceFile - such that more granularity around line ops. Made some functionality explicitly work without source. Improved Doc types naming.	23 February 2021, 17:36:46 UTC
4bf01b0	Tim Foley	23 February 2021, 09:47:19 UTC	Some ad hoc parser fixes (#1723) The `AdvanceIfMatch()` method was introduced to the parser as a way to avoid infinite loops when parsing nested list structures (e.g., `()`-enclosed parameter lists). The basic idea is that it tries to detect if we have scanned "too far" looking for a closing token, and reports a match to whatever logic was doing the looping to break the stalemate. Unfortunately, the `TryRecoverBefore` logic was changed at some point so that it doesn't necessarily advance any tokens at all, because we generally don't want to skip over a `}` while searching for a `)`. As a result, we could still end up in an infinite loop where we didn't consume any additional tokens as part of recovery, but wouldn't bail out of the search for a match. This change tries to introduce a slightly more systematic setup where `AdvanceIfMatch` is now parameterized on a type of matched token pair (not just the closing token), and each such matched token pair introduces a list of tokens where if we see them as our lookahead we should bail out (e.g., when looking for a `)` we should give up the search upon seeing a `}`). After installing that fix I found that my simple test case still gave a surprising error because when mistakenly parsing a function body the parser would look for a `{` and then a `}` to close the body. The search for a closing `}` could accidentally consume a `}` meant for an outer scope, and lead to a cascading failure. I made a quick fix to the parsing of block statements so that we don't look for a closing `}` if we never had an opening `{`, but that isn't really a systematic solution like we truly need. For now, these fixes will avoid the infinite-loop case, and should give a better diagnostic in the case a user ran into, but we need to take time to do some more top-down work on the parser sooner or later.	23 February 2021, 09:47:19 UTC
025c0ed	Tim Foley	22 February 2021, 23:07:26 UTC	Add basic support for fragment shader interlock (FSI) (#1722) Both D3D "rasterizer ordered views" (ROVs) and GLSL "fragment shader interlock" (FSI) are aimed at the same basic use case: they allow for fragment shaders to contain operations that require mutual exclusion and/or deterministic ordering between fragment shader invocations that affect the same framebuffer coordinates. The language-level exposure of the features varies greatly between the two API families, though: * ROVs define an implicit ordering and mutual exclusion constraint: certain resource parameters are marked as `RasterizerOrdered`, and reads/writes to these resources must be sequenced as if fragment-shader invocations ran in sequential order for each pixel. * FSI defines paired begin/end functions that mark a critical section of code. All memory operations in the critical section must be sequenced as if fragment-shader invocations ran in sequential order for each pixel. In order to make this model tractable, only a single critical section is allowed per fragment shader, and the begin/end must appear at the top level of the shader entry point function (not under control flow or after a possible conditional `return`). The simplest way for Slang to support portable programs that run across both API families is to insist that code that cares about these ordering guarantees must use both mechanisms, and then each of them will only affect the API that cares about it. Slang already supports ROV resource types, and already lowers them to plain textures for GLSL/SPIR-V. This change adds the missing feature of a begin/end function pair for FSI, which will map to empty functions on non-GLSL targets.	22 February 2021, 23:07:26 UTC
e1e4220	Tim Foley	19 February 2021, 18:32:19 UTC	Add a chapter on target platforms (#1720) * Add a chapter on target platforms The primary goals of this chapter are: * Make users aware of just how many different ways of handling things there are across targets. If a user leaves this chapter thinking "how in the world can you abstract over all these differences?", then we have done our job, because they are primed to understand why layout and parameter binding are necessarily complicated. * Help users to understand/recall the relevant capabilities and restrictions of the platforms they care about most. If somebody only cares about D3D12 and Vulkan, I want them to leave with a detailed understanding of how those two differ so they can understand the specifics of where the layout and parameter-binding algorithms have to treat those targets differently. All of this could conceptually be just a background section in the layout and parameter-binding chapter, but putting it off in its own chapter avoids that one taking forever to actually get where it is going. * Typos	19 February 2021, 18:32:19 UTC
5f7dc28	Yong He	19 February 2021, 18:11:01 UTC	Make gfx library visible to external user. (#1719) * Make gfx library visible to external user. * Fixup	19 February 2021, 18:11:01 UTC
22fe1df	Yong He	18 February 2021, 02:46:14 UTC	Fix typo in user guide.	18 February 2021, 02:46:14 UTC
b1e376f	Tim Foley	18 February 2021, 02:42:23 UTC	Streamline shader object creation (#1717) This change kind of rolls together two different simplifications: 1. The `createShaderObject()` shouldn't really need to take an `IShaderObjectLayout` because it could just take the `slang::TypeLayoutReflection` instead and create the shader-object layout behind the scenes. 2. For that matter, it needn't take a `slang::TypeLayoutReflection` either, because it could just take a `slang::TypeReflection` and query the layout of that type behind the scenes. The combination of these two changes means: * `IShaderObjectLayout` is gone from the public API, as is `createShaderObjectLayout()` * `createShaderObject()` directly takes a `slang::TypeReflection` and allocates a shader object of that type The result is simpler and more streamlined application code. Note that under the hood the implementation still has shader-object layouts, using the `ShaderObjectLayoutBase` class. A few locations had to change to use `RefPtr`s instead of `ComPtr`s now that the class is no longer a public COM-lite API type. The hope is that this change makes it easier to allocate/cache layouts for things like specialized types "under the hood," as is needed to implement parameter setting for static specialization.	18 February 2021, 02:42:23 UTC
bdb0c0b	Yong He	18 February 2021, 01:41:57 UTC	Further documentation on Slang specific features (#1716) Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	18 February 2021, 01:41:57 UTC
62a0193	Tim Foley	18 February 2021, 00:53:17 UTC	Use CPU memory for shader object ordinary data (#1714) This change makes it so that the shared shader object implementation across graphics APIs (everything except CUDA and CPU) uses a host-memory buffer to store ordinary (aka "uniform") data while the shader object is being set up / modified, and then allocates and initializes a GPU-memory buffer for the data on-demand once setup is complete. This choice is a necessary step for supporting interface/existential-type fields in the presence of static specialization, because any fixed-size GPU buffer we would try to allocate at the time an object is first created might not turn out to be large enough if static specialization must handle a concrete type that doesn't "fit" into the fixed-size space reserved for an existential value (resulting in the value having to be placed in an overflow region outside the original object). This change does not include any of the work related to actually laying out existential-type fields in this fashion. It instead just focuses on changing when and where the GPU memory allocation is performed to one that is more appropriate for those subsequent changes.	18 February 2021, 00:53:17 UTC
360d4f7	jsmall-nvidia	18 February 2021, 00:04:48 UTC	More #line improvements (#1713) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP: First pass in supporting output of line error information. * Add support for lexing to better be able to indicate SourceLocation information. * Fix lexer usage in DiagnosticSink in C++ extractor. * Update diagnostics tests to have line location info. * Fixed test expected output that now have source location information in them. * Better handling of tab. * Fix test expected results for tabbing change. * DiagnosticLexer -> DiagnosticSink::SourceLocationLexer Added line continuation tests. * Fix typo. * Added String::appendRepeatedChar * Change to rerun tests. * Added source locations to IR dumping. * Output column for IR dump source loc. * Add support for closing brace location to AST. Use closing brace location in lowering when adding return void. * Set the source location through SourceLoc - simplifies identifying if current loc is valid. * Copy terminator sloc. * Test for improved #line handling. * Made writer the last parameter for dumpIR. Small improvements to comments. * Disable sloc output on dump IR by default. * Fix issue with #line and inlining. * Fix for output with improved #line output. * Small comment change - mainly to kick off TC build. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	18 February 2021, 00:04:48 UTC
e59aee1	Yong He	17 February 2021, 23:09:09 UTC	Add `SampleGrad` overload for lod clamp. (#1711) * Add `SampleGrad` overload for lod clamp. * Fix gfx to run the test on vulkan. * Whitespace change to trigger CI build * remove presentFrame call in render-test Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	17 February 2021, 23:09:09 UTC
39975b2	Tim Foley	16 February 2021, 22:03:39 UTC	Fixes to get shader-object example working on CUDA (#1708) The purpose of these changes is to make the `shader-object` example work correctly on CUDA. Originally I had tried to add changes to the "flat" reflection information so that it introduced descriptor ranges to match the binding ranges it added for interface/existential-type fields. This approach helped the CUDA code that was using that information to try and compute uniform offsets for those fields, but it broke most of the other renderer back-ends. Instead, I removed the relevant asserts from `CUDAShaderObject::setObject()`. Note that there are leftover changes from my edits to the flat reflection information, around how it handles "leaf" fields that consume multiple resource kinds. I believe that those changes are, on balance, "more correct" now than they were before, so I decided to leave them in. The other major fix here is to specialize the `CUDAShaderObject::setObject()` logic to handle the case of setting a shader object for a parameter that has interface type instead of a constant-buffer or parameter block. Mostly I just copy bytes from the child object into the parent object. There are a few caveats, though: * I am not writing the RTTI or witness-table information, so dynamic dispatch won't work. * I am assuming a hard-coded offset of 16 bytes for the any-value, which will work for now but is a bit too "magical" and might also break once we support conjunctions of interfaces with dynamic dispatch. * I am assuming that the child value to be written into the field will "fit" into the any-value area. We need some way to determine whether or not things fit dynamically (ideally using the reflection data), and adapt accordingly. 
* I had to add another method on the base CUDA shader object type to handle setting data using a device-memory pointr instead of a host-memory pointer * There's not a lot we can do about it, but in the case of assigning an ordinary `CUDAShaderObject` into an interface-type field of a `CUDAEntryPointShaderObject` we end up needing to perform a device->host memory copy, because the bytes of the value will have already been written to GPU memory, but need to be in GPU memory for the dispatch call. * The implementation I'm using here basically assumes that the child shader object must have been finalized before it gets plugged into the parent shader object. We haven't yet made a policy decision about that bit.	16 February 2021, 22:03:39 UTC
e474c4e	Tim Foley	16 February 2021, 19:48:21 UTC	Add an accessor for IRInst opcode (#1707) * Add an accessor for IRInst opcode The main change is renaming `IRInst::op` to `IRInst::m_op` and adding an accessor `IRInst::getOp()` to read it. The rest of the changes are just changing use sites to `getOp` (or to `m_op` in the limited cases where we write to it). This work is in anticipation of a future change that might need to store an extra bit in the same field as the opcode. It seemed better to do this massive refactoring as a separate PR. * fixup	16 February 2021, 19:48:21 UTC
5777545	Yong He	12 February 2021, 23:01:45 UTC	Add associated type and generic value parameter doc section (#1706) * Add associated type and generic value parameter doc section * Typos and corrections.	12 February 2021, 23:01:45 UTC
e2096cf	Tim Foley	12 February 2021, 21:48:11 UTC	Initial support for DXR payload access qualifiers (#1705) This change adds initial support for a feature being proposed for inclusion in dxc: https://github.com/microsoft/DirectXShaderCompiler/pull/3171. The main features are: * A `[payload]` attribute that indicates which `struct` types are intended to be used as payloads. Consistent use of this attribute should mean that an application no longer needs to manually specify a maximum payload size when creating a ray-tracing pipeline. * `read(...)` and `write(...)` qualifiers which can be attached to fields of `struct` types (usually `[payload]`-attributed types) to indicate which ray tracing pipeline stages are allowed read/write access to that part of the payload. Use of these qualifiers should allow an implementation to optimize storage of ray payload elements across RT pipeline stages. The work in this change just adds basic parsing for these features, translation to matching IR decorations, and then emission of HLSL text based on those decorations. Notable gaps in this first change include: * No work is currently being done to validate access to ray payloads in RT entry points based on these qualifiers. * The stage names in `read(...)` and `write(...)` are not being validated, and are being stored in the IR as text. These should probably use the `Stage` enumeration in some fashion, but we would need to have a way to encode the additional `caller` pseudo-stage that the feature uses. * No work is currently being done to adjust or react to the chosen shader model when emitting HLSL code. We should either have these attributes force a switch to a higher shader model, or skip emission of these attributes if the chosen shader model / profile does not imply support for them. * No tests are currently included for this work, because tests would rely on using a custom `dxcompiler.dll` build with the new feature supported.	12 February 2021, 21:48:11 UTC
0dea127	Yong He	12 February 2021, 20:54:48 UTC	First part of interfaces and generics doc. (#1704) Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	12 February 2021, 20:54:48 UTC
0befc84	Tim Foley	12 February 2021, 20:53:56 UTC	Further documentation work (#1703) * Move around the conventional/convenience features chapters * Add a first draft of a section on compilation using `slangc` and the COM-lite API Co-authored-by: Yong He <yonghe@outlook.com>	12 February 2021, 20:53:56 UTC
a2401a6	Yong He	12 February 2021, 20:20:17 UTC	Support `bit_cast` between complex types. (#1702) * Support `bit_cast` between complex types. * Fix vs project file * Fix clang build error * fix * fix * Fix * FIx * Fix * Fix * Fix * Fix * Fix linux compile error Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	12 February 2021, 20:20:17 UTC
369279e	jsmall-nvidia	12 February 2021, 19:31:56 UTC	Diagnostic location highlighting (#1700) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP: First pass in supporting output of line error information. * Add support for lexing to better be able to indicate SourceLocation information. * Fix lexer usage in DiagnosticSink in C++ extractor. * Update diagnostics tests to have line location info. * Fixed test expected output that now have source location information in them. * Better handling of tab. * Fix test expected results for tabbing change. * DiagnosticLexer -> DiagnosticSink::SourceLocationLexer Added line continuation tests. * Fix typo. * Added String::appendRepeatedChar * Change to rerun tests. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	12 February 2021, 19:31:56 UTC
cd79bfb	Yong He	11 February 2021, 18:09:20 UTC	Add convenience features chapter in user-guide doc (#1699) * Fix getting started doc * Add convenience features chapter in user-guide doc Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	11 February 2021, 18:09:20 UTC
e1b1ce3	Tim Foley	11 February 2021, 00:35:52 UTC	Fix a bug in IR lowering (#1701) The underlying problem here is that our `SharedIRBuilder` (which currently owns the "global" value-numbering map) has a subtle invariant ("subtle" in the sense of "dangerous and bad"). The value-numbering map stores `IRInst`s for things like constants and types, and if those instructions end up getting modified or deleted (deleting an instruction currently runs its destructor but does not free the pool-allocated memory), then it is possible for the computed hash code for an instruction to no longer match what it was when it was inserted. The trigger in this case was a use of the `IRInst::removeAndDeallocate()` operation inside of the AST-to-IR lowering pass, which uses a single `SharedIRBuilder`. If that `removeAndDeallocate()` happens to apply to a value in the value-numbering map, then it risks breaking the next time the map gets rehashed. The short-term fix here is simple: never try to delete an instruction during IR lowering, even if it is known to be unused. Instead, we can rely on the subsequent DCE pass to eliminate the instruction. A longer-term fix here would involve fixing our entire strategy around value numbering. We know we need to do that, but that would be a big enough change that it couldn't be pursued as part of a simple bug fix like this.	11 February 2021, 00:35:52 UTC
8750a7c	Tim Foley	10 February 2021, 02:16:28 UTC	Add more to User's Guide (#1698) This change adds a first draft of an Introduction chapter, along with a chapter about the "conventional" features of Slang (when compared to HLSL, GLSL, and C/C++).	10 February 2021, 02:16:28 UTC
03f6389	Yong He	09 February 2021, 16:40:27 UTC	Add getting started documentation (#1697) * Add getting started documentation * wording * wording	09 February 2021, 16:40:27 UTC
53ff724	jsmall-nvidia	08 February 2021, 22:53:02 UTC	Hotfix/doc typo lexical (#1696) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix typo	08 February 2021, 22:53:02 UTC
10a55d8	jsmall-nvidia	08 February 2021, 22:49:45 UTC	DX12 & NVAPI fixes (#1695) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix bugs with m_features on Dx12 and gl. Fix issue about GFX_NVAPI availability. * Fix handling of SLANG_E_NOT_AVAILABLE on renderer startup. * Clarify comment. * Improve comment.	08 February 2021, 22:49:45 UTC
891791e	jsmall-nvidia	08 February 2021, 21:29:31 UTC	Copy SourceLoc when inlining (#1692) * #include an absolute path didn't work - because paths were taken to always be relative. * Copy source loc information when inlining.	08 February 2021, 21:29:31 UTC
df7548e	Yong He	05 February 2021, 22:36:07 UTC	Shader-Object example (#1694)	05 February 2021, 22:36:07 UTC
5fbaccf	jsmall-nvidia	05 February 2021, 19:59:46 UTC	Typo in renderer name for DX12 (#1693) * #include an absolute path didn't work - because paths were taken to always be relative. * Typo for renderer name for DX12.	05 February 2021, 19:59:46 UTC
