Revision history - None - origin: https://github.com/shader-slang/slang

Revision	Author	Date	Message	Commit Date
c6fd4a5	Yong He	19 January 2021, 17:10:15 UTC	Make `ShaderCursor` no longer depend on core. (#1661)	19 January 2021, 17:10:15 UTC
1296c7b	Yong He	18 January 2021, 06:00:49 UTC	Make `gfx` compile to a DLL. (#1660) * Make `gfx` compile to a DLL. * Fix cuda * Fix cuda build * Bug gl screen capture bug.	18 January 2021, 06:00:49 UTC
2a5d5b3	Tim Foley	15 January 2021, 20:10:06 UTC	Convert more tests to use shader objects (#1659) This change converts a large number of our existing tests to use the `ShaderObject` support that was added to the `gfx` layer. In many cases, tests were just updated to pass `-shaderobj` and the result Just Worked. In other cases, a `name` attribute had to be added to one or more `TEST_INPUT` lines. For tests that did not work with shader objects "out of the box," I spent a little bit of time trying to get them to work, but fell back to letting those tests run in the older mode. Future changes to the infrastructure will be needed to get those additional tests working in the new path. Along with the changes to test files, the following implementation changes were made to get additional tests working: * Because the shader object mode uses explicit register bindings (from reflection), the hacky logic that was offsetting `u` registers for D3D12 based on the number of render targets gets disabled (by another hack). * The "flat" reflection information coming from Slang was not correctly reporting "binding ranges" for things that consumed only uniform data (which would be everything on CUDA/CPU), so it was refactored to properly include binding ranges for anything where the type of the field/variable implied a binding range should be created (even if the `LayoutResourceKind` was `::Uniform`). * A few fixes were made to the CUDA implementation of `Renderer`, in order to get additional tests up and running. Most of these changes had to do with texture bindings, which hadn't really been tested previously. In addition, a few changes were made that were attempts at getting more tests working, but didn't actually help. These could be dropped if requested: * As a quality-of-life feature (not being used) the `object` style of `TEST_INPUT` line is upgraded to support inferring the type to use from the type of the input being set. 
* Any `object` shader input lines get ignored in non-shader-object mode.	15 January 2021, 20:10:06 UTC
f834f25	Yong He	14 January 2021, 23:48:54 UTC	COM-ify all slang-gfx interfaces. (#1656) * COM-ify all slang-gfx interfaces.	14 January 2021, 23:48:54 UTC
ac76997	jsmall-nvidia	14 January 2021, 23:03:51 UTC	Adding missing VisualStudio lz4 project (#1657) * #include an absolute path didn't work - because paths were taken to always be relative. * Added missing lz4 visual studio project.	14 January 2021, 23:03:51 UTC
723796a	jsmall-nvidia	11 January 2021, 20:24:11 UTC	LZ4 compression support (#1654) * #include an absolute path didn't work - because paths were taken to always be relative. * Testing out use of lz4. * Added ICompressionSystem, and LZ4 implementation. * Add support for deflate compression. Simplify compression interface - to make more easily work across apis. * WIP on CompressedFileSystem. * ImplicitDirectoryCollector * SubStringIndexMap -> StringSliceIndexMap. * WIP save stdlib in different containers. * Support for different archive types for stdlib. * Fix project. * CompressedFileSystem -> ArchiveFileSystem. Added CompressionSystemType::None * Added ArchiveFileSystem * Fix problem RiffFileSystem load without compression system. * Test archive types. Improve diagnostic message. * Fix typo in testing file system archives. * Split out archive detection. * Fix gcc warning issue. * Fix warning. * RiffArchiveFileSystem -> RiffFileSystem Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	11 January 2021, 20:24:11 UTC
5554777	Yong He	11 January 2021, 17:11:52 UTC	Make `gfx::Renderer` a COM interface. (#1653) * Make `gfx::Renderer` a COM interface. This is a first step towards making the `gfx` library expose a COM compatible DLL interface. Remaining classes will come as separate PRs. * Fixup project files * Fix calling conventions * Make gfx::createRenderer() functions increase ref count by 1 Make renderer createFunc return via out parameter	11 January 2021, 17:11:52 UTC
e24c5a6	Tim Foley	08 January 2021, 00:01:48 UTC	Fill in some missing bits of capability API (#1652) * Fill in some missing bits of capability API * Make invalid/unknown capability have a zero value (this aligns it better with the public API for `SlangProfileID`, so that the two can be merged down the line) * Actually provide an implementation of `spFindCapability()` public API * fixup: bug fixes for renumbering invalid capability atom	08 January 2021, 00:01:48 UTC
d84f458	Tim Foley	07 January 2021, 21:16:18 UTC	Add a -capability command-line option (#1651) This provides a stand-alone option distinct from `-profile` that can be used to add capabilities to a target. A test has been added to confirm that `-profile X -capability Y` works the same as `-profile X+Y`. The intention is that this option could be used in applications that use the API to set up their target but then use the options-parsing logic to handle things like capabilities. Note: that latter bit has not been confirmed, so it is possible that this approach does not actually suffice for hybrid API + options usage. That will need to be confirmed in follow-up work.	07 January 2021, 21:16:18 UTC
66d4466	Tim Foley	07 January 2021, 20:18:55 UTC	Add support for [noinline] attribute (#1650) This adds the `[noinline]` attribute to the front-end, and passes it through when generating HLSL output. Notes: * This change doesn't include a test since the dxc version I have locally parses `[noinline]` but then generates DXIL that fails validation. * This change doesn't include logic to handle `[noinline]` for other targets. Notably, SPIR-V has decorations that convey the same intention, but we don't yet take advantage of the GLSL extension(s) that would let us generate those decorations. * By necessity, `[noinline]` is only a "strong suggestion" and not actually something the compiler can ever guarantee/enforce.	07 January 2021, 20:18:55 UTC
9263651	Yong He	06 January 2021, 20:58:57 UTC	Refactor GUI/Window utils out of gfx library (#1649) Co-authored-by: Yong He <yhe@nvidia.com>	06 January 2021, 20:58:57 UTC
706d4f9	Tim Foley	05 January 2021, 21:01:17 UTC	Add basic GLSL support for SV_Barycentrics (#1648) * Add basic GLSL support for SV_Barycentrics This change allows for fragment shader varying inputs marked with the `SV_Barycentrics` semantic to be mapped to GLSL code using the `gl_BaryCoordNV` builtin variable (from the `GL_NV_fragment_shader_barycentric` extension). This is the simplest possible change to get the functionality up and running, and it leaves out many things that could be desired in a more feature-complete version of the feature later: * There is no support for alternative extensions that provide similar functionality. Selection of which extension to favor could eventually be based on the "capability" work that has been put in place. * There is no attempt made to check that the input has the expected type (or to coerce it if it doesn't), so for now this is only going to be guaranteed to work for a `float3` input. * This change does not expose the `pervertexNV` qualifier added in the `GL_NV_fragment_shader_barycentric` extension, which can be used by a shader to access the uninterpolated vertex inputs. The last issue is an important one, since the HLSL `GetAttributeAtVertex` function seems to be defined to work with any incoming varying parameter that was marked with `nointerpolation`. When we have a `nointerpolation` input, it would seem that we need to know whether it will be used with `GetAttributeAtVertex` (in which case it should be declared as a `pervertexNV` array input in GLSL) or not (in which case it should be declared as a `nointerpolation` input, without an array). * fixup: missing file	05 January 2021, 21:01:17 UTC
b4f9462	Tim Foley	05 January 2021, 17:00:00 UTC	Use "capability" system to select VKRT extension (#1647) * Use "capability" system to select VKRT extension Slang currently supports translation of ray tracing shader code to Vulkan GLSL code that uses the `GL_NV_ray_tracing` extension. A multi-vendor equivalent of that extension has been released as `GL_EXT_ray_tracing` and we want Slang to support that extension as well. At the simplest, making the change from one extension to the other is just a matter of changing a few strings, since it does not appear that anything of significance was changed at the GLSL level (or even in SPIR-V). Where this gets trickier is when we have users who want us to support both extensions, and to be able to switch between them. The solution we've implemented here more or less amounts to: * If you don't tell the compiler which extension to use, it will default to `GL_EXT_ray_tracing` (the newer multi-vendor one). * If you explicitly want the older extension, you can opt into it using the `-profile` option or via a new API for explicitly adding capabilities to your target. Making that work required a few different kinds of changes: * The options parsing and public API needed ways to add optional capabilities to a target. * During GLSL code emit, we can check the capabilities that were added to the target to see if the `GL_NV_ray_tracing` extension was explicitly enabled and, if not, default to using the `GL_EXT_ray_tracing` names for things. This step is needed because some of the modifiers/attributes involved in the extension have to be handled explicitly in the code generator rather than implicitly as part of mapping intrinsic functions. * We add two different translations to the relevant operations in the stdlib, one marked with each of the extensions. If profile/capability-based overload resolution can be relied on to pick the right one, this should Just Work. 
* Next, a bunch of work had to go into making capability-based overloading Just Work for the purposes of this change. There's been a nearly complete reworking of the implementation of `CapabilitySet` here to make it more suitable for our needs. * The tests that were using ray tracing translation for Vulkan needed to be updated. For some of them I updated their baselines to use `GL_EXT_ray_tracing` so that they can test the new path. For others, I updated the command line for the test case so that it explicitly opts into using `GL_NV_ray_tracing`. The result is that we have some coverage of each extension. I would have liked to have each test run in both modes, but our pass-through glslang support doesn't support `-D` options, so I couldn't take that step easily. This change does not add support for `GL_EXT_ray_query`, the extension that supports "DXR 1.1" style queries under Vulkan. Adding support for that extension should hopefully be a smaller step because it doesn't have the same multiple-extensions issue. This change does not address a lot of possible avenues for improvement or cleanup around the capability system. It focuses only on those changes that are necessary to make the ray tracing feature work and leaves the rest for future work. * fixup: infinite loop * Comment-only change to retrigger TC build	05 January 2021, 17:00:00 UTC
f5fffa9	Dietrich Geisler	18 December 2020, 17:10:10 UTC	Heterogeneous Flag Error Visibility (#1642) * PR to fix issue #1638. This change introduces a diagnostic sink to the emitModule function, and updates all associated calls to that function. Additionally, this commit updates the heterogeneous hello world example to not need the entry and stage flags for simplicity. * Updated emit-cpp per suggested changes Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	18 December 2020, 17:10:10 UTC
0fa3bcf	Yong He	15 December 2020, 20:57:55 UTC	Cleanup CUDA renderer. (#1644) * Cleanup CUDA renderer. * More cleanup * fixes. * update comments Co-authored-by: Yong He <yhe@nvidia.com>	15 December 2020, 20:57:55 UTC
77bc70e	jsmall-nvidia	15 December 2020, 19:04:10 UTC	OSX Build/glslang premake fix (#1641) * #include an absolute path didn't work - because paths were taken to always be relative. * Improve docs. Fix premake build of glslang. * More improvements to the building.md doc.	15 December 2020, 19:04:10 UTC
1bc7948	jsmall-nvidia	14 December 2020, 21:18:51 UTC	Enable embedding stdlib for github builds (#1640) * #include an absolute path didn't work - because paths were taken to always be relative. * Enable building with embedding stdlib.	14 December 2020, 21:18:51 UTC
856d7d3	Yong He	11 December 2020, 17:42:23 UTC	Implements CUDA renderer in gfx. (#1637) * Implements CUDA renderer in gfx. * Revert unnecessary change. * Revert unnecessary changes. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	11 December 2020, 17:42:23 UTC
992778e	Tim Foley	11 December 2020, 16:50:43 UTC	Add first steps toward a "capability" system (#1636) * Add first steps toward a "capability" system We already have cases in the stdlib where we mark declarations as being specific to certain targets, e.g.: ``` // My ordinary function to add two numbers. // Works everywhere. // void myFunc(int a, int b) { return a + b; } // On the "coolgpu" target, we can use a secret intrinsic // that adds numbers even faster! // __specialized_for_target(coolgpu) void myFunc(int a, int b) { return __secretIntrinsic(a, b); } ``` The existing logic for dealing with these modifiers (`__specialized_for_target` and `__target_intrinsic`) was almost entirely string-based. We would turn the chosen compilation target into a string, and then use that to try and search for the "best" definition of a function at a few steps: * During IR linking, we always pick one definition of an `[import]`ed function, and that definition will be the one with the "best" target-specialization modifier (if any) * During final code generation, we always look up the "best" target-intrinsic modifier, and use it as the template for the code we output. This change preserves the basic flow there, but replaces the ad hoc string-based logic with something a bit more principled, in terms of a new `CapabilitySet` type. A `CapabilitySet` represents a set of zero or more atomic features (here represented as `CapabilityAtom`s). What a `CapabilitySet` means depends on how and where it is used: * A compilation target implies a `CapabilitySet` where the contents of the set are the features the target supports. * A `CapabilitySet` attached to a declaration (or a modifier on that declaration) describes a set of features that declaration requires. The current implementation of `CapabilitySet` is wasteful and inefficient, but that is something we can iterate on over time. 
In practice, most of the current code only ever uses capability sets that are either empty (because they represent a function with no specific requirements) or singleton (because they represent a single atomic capability like "is a GLSL target," "is an HLSL target," etc.). The main goal here was to put in the skeleton of a new system, including some of the features it might need down the line, and then to leave changes that eventually use the greater flexibility for later. Eventually, the capability system should encompass: * Differences between shader model versions, GLSL versions, SPIR-V versions, etc. (currently tracked with other modifiers) * Optional extensions, and functions that are made available only with certain extensions (currently tracked with other modifiers) * Front-end checking that the call graph of a program doesn't violate any capability-requirements (e.g., having a GLSL+HLSL portable function call a GLSL-only subroutine) * Hypothetically we can also try to fold stage-specific (vertex-only, fragment-only, etc.) functions into this system, but doing so would require more linker cleverness if we allow overloading on stages (since we might have to clone a caller if it calls through to a callee with multiple stage-specific versions) One important complication that the system has to deal with just because of the "do what I mean" nature of the current compiler is that sometimes a current Slang user might compile for target X and specify version N, but then use a function that actually requires version N+1 of that target. Currently the Slang compiler silently "upgrades" the version(s) used by user code in these cases, because it is often what users want in cross-compilation scenarios. Dealing with the "silent upgrade" situation requires us to be a little careful and sometimes pick a "best" capability set that doesn't appear to be supported on our target. 
Refining that system and potentially getting rid of the "do what I mean" behavior over time could be a goal for future changes. * fixup: handle case where value is incompatible during linking	11 December 2020, 16:50:43 UTC
4337338	jsmall-nvidia	10 December 2020, 19:04:29 UTC	Building with embedded stdlib (#1634) * #include an absolute path didn't work - because paths were taken to always be relative. * Move reflection to reflection-api. * Slight reorg to pull out potentially Slang internal functions from the reflection API impls. * Remove visual studio projects * Fix for slang-binaries copy. * Add the visual studio projects in build/visual-studio * Remove miniz project. * Differentiate the linePath from the filePath. * Improve comment in premake5.lua + to kick off CI. * Kick CI. * Use COM compile request for calls to functions inside api-less-slang. Add static-slang project. * Fix const typo issue. * Don't include 'core' link in 'api-less-slang' * Removed static-slang lib causes problems on linux with linking. Embed Slang stdlib Added StaticBlob Added dumpSourceBytes Use ConstArrayView for the archive. At startup allow loading of zip with stdlib. Made -save-stdlib -load-stdlib take a name Added '-save-stdlib-bin-source' to save out serialized stdlib as source. * Ability to enable/disable stdlib embedding. * Fix problem with moduleDecl not having module pointer set when serialized in. * Set of debugdir for slang-test and examples. * Add slang-stdlib-api.cpp * Update slang filters for VS. * Try to use pic, and -mcmodel=medium * Some more efforts to make premake work. * WIP premake5.lua from previously working version. * Remove api-less-slang project. * Disable dllexport on gcc/clang. * Embed via slangc-bootstrap. * Fix slang-profile. Always compiles without stdlib. * Use pic "On" * Remove slangc-bootstrap and embed-stdlib-generator if embedding not required. Make bootstrap run the generators. * Improve comments in premake5.lua. Kick off another CI build. * Remove generation of stdlib source from std-lib-serialize.slang	10 December 2020, 19:04:29 UTC
e4a8251	Yong He	10 December 2020, 17:43:09 UTC	Move ShaderObject to be under renderer interface. (#1633) * Move ShaderObject to be under renderer interface. * Make `createPipelineState` take `const PipelineStateDesc&`. Move ShaderCursor implementation to a cpp file	10 December 2020, 17:43:09 UTC
b8e1f62	Tim Foley	08 December 2020, 02:18:31 UTC	Fix a subtle bug introduced into type legalization (#1632) The refactor of type legalization in PR #1594 introduced a subtle problem where an IR instruction might be removed from the hierarchy (perhaps because its parent was removed during legalization) but would still be on the work list. Legalization of such instructions is wasteful (since it would never impact the output), but it also creates a problem if we try to insert new legalized instructions next to such a removed instruction. The logic for inserting an instruction before/after another asserts that the sibling instruction must have a parent, and leads to a failure in debug builds and a potential crash in release builds. This change adds a bit of defensive code to skip any instructions that appear to have been removed from the hierarchy (because they have no parent and are not the root/module instruction). An alternative approach would be to try to detect these instructions at the point where they would be added to the work list, but this approach seems simpler and more general.	08 December 2020, 02:18:31 UTC
0404fef	Tim Foley	07 December 2020, 22:17:17 UTC	Fix mistake where some public API functions weren't extern "C" (#1631) This was a serious problem that meant that some of our public API functions (exported from the DLL) had mangled C++ symbol names (which are not guaranteed to be portable across different compilers). The fix here is the expedient one: just add `extern "C"` to the relevant functions. The simplicity has a cost, though, in that this change introduces a significant break in binary compatibility. Source code should be compatible before/after this change, but it would be necessary to match an application and Slang shared library so that they agree on the mangled names of these symbols. We will need to treat this as a breaking release when this change goes out. Note that because of the way that C++ mangled names were being used, we already have introduced breaking changes to the binary interface in a recent change that made `SlangCompileRequest` into a `typedef` instead of a forward-declared `struct` type. To be clear, the breakage was due to the missing `extern "C"` and not the content of that change; the change would have been binary-compatible if the symbol names were correct. Given that the cat is out of the bag to some extent (our next release is going to break compatibility one way or another), it seems best to just get the whole thing sorted at once.	07 December 2020, 22:17:17 UTC
d7ce74a	Tim Foley	07 December 2020, 17:29:37 UTC	"Shader Toy" example and related fixes (#1629) * "Shader Toy" example and related fixes This change introduces a new `shader-toy` example program that is primarily designed to show how Slang's features for type-based encapsulation and modularity can be applied to modularity for effects along the lines of those from `shadertoy.com`. The Example ----------- The example is being checked in with an example "toy" effect that I hastily put together, so that it would not be encumbered with any IP concerns. I wrote the effect using the shadertoy.com editor, so I can be sure it is valid GLSL. During bringup of the application I used a pre-existing and larger effect for testing, so some of the support code that was added is not being used at present. The big-picture idea here is to have an example that shows how to modularize things using Slang interfaces and generics, and then to use the Slang compiler API to manage the compilation, composition, specialization, and linking steps. For better or worse this leads to the sequence of API calls involved being much longer than what was in something like the `hello-world` example. Future Work (Example) --------------------- There is a lot of room for improvement and expansion here, so this should be viewed as a checkpoint of work in progress rather than something I'm claiming as a finalized demonstration of all we'd like to achieve. Areas for future work include: * We need to copy the integration of "Dear, IMGUI" that was already done for the `model-viewer` example so that this example can have a UI. * Now that the compilation flow is broken into all these additional steps, it should be possible to have the application load multiple effects as distinct modules, and then provide a UI for switching between them. The chosen effect module would be used to specialize the top-level shader(s) before kernel generation. 
* The checked-in logic includes a compute shader that can execute an effect, but that hasn't been tested nor has it been wired up to any kind of UI. We should have a way to switch between multiple execution methods, with a goal of eventually including CPU execution. * The "GLSL compatibility" code needs a lot of improvements before it is likely to be usable for a nontrivial number of shaders. Some of that work is waiting on Slang compiler fixes, though. * We should consider allowing the individual "toy" effects to define their own uniform parameters and expose those via a UI and reflection. The catch in this case is not that this would be difficult to do, but that it would be a semantic change to how shader toy effects currently work. The Compiler Fixes ------------------ Doing this work exposed a few bugs in Slang, and this change includes fixes for the ones that were quick to address. We already had logic in `slang-check-shader.cpp` that was validating the entry points in a compile request - either by checking the explicitly-listed entry points, or by scanning for `[shader("...")]` attributes. The problem is that the routine that did that checking was not being invoked on all compiles. The logic that handled entry points was only being run for manual compiles using `SlangCompileRequest`, while anything using `import` or `loadModule` would ignore entry points. I refactored the relevant code into a subroutine that will be invoked in all compilation scenarios. There were already `TODO` comments in `SpecializedComponentType` which made the point about how a specialized entry point like `myShader<YourType>` would need to properly show that it has dependencies on both the module that defines `myShader` and the module that defines `YourType`, while only the former was being handled at present. 
I went ahead and implemented the logic to scan the generic arguments for a specialized component type in order to determine what module(s) the arguments depend on (both type arguments and witness tables). With that change, using `IComponentType::link` on a specialized component will properly pull in the module(s) that the generic arguments come from. In `slang-ir-legalize-types.cpp` we could run into assertion failures in debug builds because of code trying to legalize layout `IRAttr`s for fields or parameters with types that need legalization. In practice it is safe to skip these layout attributes, because legalization of the fields/parameters they pertain to would result in creation of entirely new layout attributes, and the old ones would then be unreferenced. Future Work (Fixes) ------------------- There are other compiler bugs that this work exposed, but which this change does not address. These will need to be resolved as part of subsequent changes: * Slang allows for default-initialization of variables of a generic type. That is, given `<T : ISomething>` a user is allowed to declare `T x = {};` and the Slang front-end does not complain. Instead, this leads to an internal compiler error during IR lowering. * The Slang `__init()` feature probably needs to be upgraded to a properly supported feature, and we probably need a way to make implementing default-initialization an easy thing (e.g., any `struct` type that has initial-value expressions for all its fields should automatically and implicitly satisfy an `init();` requirement declared in an interface) * Inside an `__init()` definition, code has mutable access to members of the enclosing type, but for some reason the front-end is incorrectly treating `this` as immutable in those contexts. As a result you can write to `someField` but not `this.someField`. 
* User-defined operator overloads flat out don't work (which isn't surprising given that no clients have decided to use them yet, and we have no test coverage for them). This is actually due to the shadowing rules being used for lookup right now, so a fix for this issue is going to have far-reaching consequences around what overloads are visible where (and anything that impacts overload resolution is a big can of worms, including around performance). * fixup: test case had missing main function	07 December 2020, 17:29:37 UTC
e98c32f	Yong He	04 December 2020, 18:44:03 UTC	add windows release script (#1627) Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	04 December 2020, 18:44:03 UTC
47ed0f6	jsmall-nvidia	04 December 2020, 18:03:29 UTC	Projects in 'build' and Slang API separation (#1624) * #include an absolute path didn't work - because paths were taken to always be relative. * Move reflection to reflection-api. * Slight reorg to pull out potentially Slang internal functions from the reflection API impls. * Remove visual studio projects * Fix for slang-binaries copy. * Add the visual studio projects in build/visual-studio * Remove miniz project. * Differentiate the linePath from the filePath. * Improve comment in premake5.lua + to kick off CI. * Kick CI.	04 December 2020, 18:03:29 UTC
277780a	Yong He	03 December 2020, 22:48:42 UTC	Add github action to verify vs project file consistency. (#1625) * Add github action to verify vs project file consistency. * fix solution files * fix project files	03 December 2020, 22:48:42 UTC
a827798	jsmall-nvidia	03 December 2020, 17:55:29 UTC	Added miniz Visual Studio Project (#1623)	03 December 2020, 17:55:29 UTC
44c0a56	Yong He	03 December 2020, 16:23:05 UTC	Add shader object parameter binding to renderer_test. (#1622) * Add shader object parameter binding to renderer_test. * remove multiple-definitions.hlsl * Fix cuda implementation. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	03 December 2020, 16:23:05 UTC
ad5dda9	Tim Foley	02 December 2020, 22:12:51 UTC	Fix [mutating] generic methods (#1618) Slang generates code that turns the implicit `this` parameter of a method into an explicit parameter. The logic that decides whether that parameter should be `inout` is a bit involved, and there was a bug where a generic method would lead to the use of an `in` modifier (the default) and override the `inout` modifier that was requested by the method itself. This change fixes the logic to treat generic declarations in the parent chain of a leaf method as having no bearing on whether an implicit `this` parameter should be `inout` or not. A test case is included that breaks with the old behavior, and demonstrates that a generic `[mutating]` method can now work correctly.	02 December 2020, 22:12:51 UTC
ae222bf	jsmall-nvidia	02 December 2020, 16:29:38 UTC	Zip FileSystem support (#1617) * #include an absolute path didn't work - because paths were taken to always be relative. * Add miniz * Fix for separator in CacheFileSystem. Add compression unit test for zip. * Put zip compression into core. * Remove delimiter stripping if simplifying a path - as stripping will fix delimiters. * ZipFileSystem WIP. * More ZipFileSystem working. * Added isEmpty. Fixed small bug in contains. * First pass support for mutability on zip. * Improvements to File::read/writeAllBytes * Can access and save archive - but has memory leaks. * Fix memory leak. * Some ZIP compression tests. * Fix memory leak on ScopedAllocation. Fix off by one bug on UIntSet * Bug fix in UIntSet * Fix remaining ZipFileSystem issues. Added stand alone unit-test. * Turn tabs to spaces in slang-io.h * Renamed mode ReadWrite (instead of just Write) * Make miniz its own project. * Fix windows warning on win32. * Remove warnings needed when miniz was included as a header library. * Set the C++ standard via 'flags' in premake. * Add support for 'implicit' paths. * Add testing for implicit directories. Better handling of implicit directories. * Improve comments in ZipFileSystem. * Update comment around reader/writer transformation.	02 December 2020, 16:29:38 UTC
e631a25	jsmall-nvidia	02 December 2020, 13:45:08 UTC	Fix for TC problem with Path unit test (#1621) * #include an absolute path didn't work - because paths were taken to always be relative. * Hopefully fix for TC issue where canonical path causes problems - perhaps because on test machines visibility of paths outside the build environment is limited.	02 December 2020, 13:45:08 UTC
200e236	jsmall-nvidia	01 December 2020, 22:29:01 UTC	Make SlangCompileRequest COM type (#1620) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP for COM CompileRequest. * Add more methods to IGlobalSession. * Fix createCompileRequest. Made slangc tool use COM style methods. * m_ prefix variables in EndToEndCompileRequest	01 December 2020, 22:29:01 UTC
339422d	Yong He	30 November 2020, 16:59:34 UTC	Enable all dynamic-dispatch tests on D3D/VK. (#1615)	30 November 2020, 16:59:34 UTC
de8dbdd	Yong He	30 November 2020, 16:59:02 UTC	Re-enable `interface-shader-param` tests. (#1614)	30 November 2020, 16:59:02 UTC
c0fab43	jsmall-nvidia	20 November 2020, 22:24:35 UTC	Bug fixes: Memory leak/off by one on UIntSet (#1616) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix memory leak on ScopedAllocation. Fix off by one bug on UIntSet	20 November 2020, 22:24:35 UTC
ee5842a	Yong He	20 November 2020, 16:34:14 UTC	Make witness and RTTI handles lower to `uint2`. (#1613) * Make witness and RTTI handles lower to `uint2`. And enable some dynamic dispatch tests on D3D/VK. * Bug fixes.	20 November 2020, 16:34:14 UTC
4459d44	Tim Foley	19 November 2020, 09:26:43 UTC	Unify handling of static and dynamic dispatch for interfaces (#1612) Overview ======== Prior to this change, we had two different code generation strategies for interface/existential types in Slang, that didn't always play nicely together: * The "legacy" static specialization approach could handle plugging in an arbitrary concrete type for an existential type parameter (including types with resources, etc.), but wouldn't work well with things like a `StructuredBuffer<>` of an interface type, and requires somewhat counter-intuitive layout rules to make work. * The new dynamic dispatch approach produces simpler, more easily understood layouts by assuming that values of interface type can fit into a fixed number of bytes. The tradeoff there is that it cannot handle types that include resources (only POD types). The goal of this change is to make it so that the two strategies can co-exist. In particular, in cases where a shader is amenable to both static specialization and dynamic dispatch, the type layouts should agree. In order to make the type layouts agree, we: * Declare that all values of existential type reserve storage according to the dynamic-dispatch rules (so 16 bytes for the RTTI and witness-table information, plus whatever bytes are needed to store "any value" of a conforming type). * Then we modify the "legacy" layout rules so that if a value of concrete type can fit in the reserved "any value" space for a given interface, then it is laid out there exactly like the dynamic dispatch rules would do. Otherwise, we fall back to the previous legacy rules (since we don't need to agree with the dynamic-dispatch layout on types that can't be used with dynamic dispatch). 
Details ======= * Renamed `ExistentialBox` to `BoundInterfaceType` to better clarify how it relates to `BindExistentialsType` * Unconditionally apply the `lowerGenerics` pass during emit, since it is now responsible for aspects of the lowering of existential types when specialization is used. * Made IR type layout take the target into account, so that the layout of resource types can vary by target (e.g., being POD on some targets, and invalid on others) * Cleaned up some issues around using global shader parameters as the "key" for their layout information in the global-scope layout (only comes up when there are global-scope `uniform` parameters) * Made there be a default any-value size (16) instead of making it be an error to leave out. This was the simplest option; we could try to go back to having an error, but we'd need to only issue it if we are sure a type/interface is being used with dynamic dispatch, since static dispatch doesn't have to obey the restrictions. * Changed lowering of existential types to tuples so that bound interfaces where the concrete type won't fit use a "pseudo-pointer" instead of an "any-value" to hold the payload * Changed IR type legalization to handle the "pseudo-pointer" case and apply layout information from an interface type over to the payload part when static specialization was used. * Changed some details of how witness tables were being lowered, so that we didn't have to create "proxy" witness tables for the constraints on associated types (just use the actual requirement entries we generate) * Changed witness tables so that they know the subtype doing the conforming * Added logic so that we don't generate pack/unpack logic and witness table wrapper functions for types that are incompatible with any-value/dynamic dispatch for a given interface. 
* Changed the core AST-level type layout logic to use the dynamic-dispatch layout in case things fit, and the legacy static specialization case when things don't (while also reserving space for the dynamic-dispatch fields) * Changed a bunch of test cases for static specialization to properly use the new layout (which introduces new buffers in some cases, and moves data around in others). Future Work =========== The experience of trying to reconcile our older way of handling interface-type specialization with our newer model (that supports dynamic dispatch) makes it clear that we really need to make similar changes to our handling of generic type parameters on entry points and at the global scope. A future change should make it so that a global type parameter is lowered with a type layout similar to a value parameter of interface type, including the RTTI and witness-table pieces, and just leaving out the "any value" piece. A similar translation strategy should apply to entry-point generic parameters (mirroring how we lower generic functions for dynamic dispatch already), and value specialization parameters. Co-authored-by: Yong He <yonghe@outlook.com>	19 November 2020, 09:26:43 UTC
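The layout rule described in #1612 can be sketched as a small decision function. This is a minimal illustration, not the compiler's implementation: the function name, the `has_resources` flag, and the returned dictionary are hypothetical, and treating "contains resources" as a simple boolean is a simplification (the message notes that resource types can be POD on some targets). The 16-byte header and the default any-value size of 16 are the two numbers taken from the commit message.

```python
HEADER_BYTES = 16            # RTTI + witness-table information (from the commit message)
DEFAULT_ANY_VALUE_SIZE = 16  # default any-value size (from the commit message)


def existential_layout(concrete_size, has_resources,
                       any_value_size=DEFAULT_ANY_VALUE_SIZE):
    """Pick a payload strategy for a value of interface type (hypothetical helper).

    Storage is always reserved according to the dynamic-dispatch rules.
    The concrete value is placed in the any-value area only when it fits
    and is usable with dynamic dispatch; otherwise the legacy rules apply.
    """
    reserved = HEADER_BYTES + any_value_size
    fits = (not has_resources) and concrete_size <= any_value_size
    return {
        "reserved_bytes": reserved,
        "payload": "any-value" if fits else "legacy",
    }


small = existential_layout(concrete_size=8, has_resources=False)
large = existential_layout(concrete_size=24, has_resources=False)
```

Under these assumptions both cases reserve the same 32 bytes, which is what lets the static-specialization and dynamic-dispatch layouts agree whenever the concrete type fits.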
b594510	jsmall-nvidia	19 November 2020, 09:14:48 UTC	File system refactor (#1611) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP FileSystem refactor. * Made loadFile load the file in binary mode. * Fixed some comments. Fixed typo in RelativePath - not used 'fixedPath'.	19 November 2020, 09:14:48 UTC
ac41b99	Yong He	19 November 2020, 09:14:34 UTC	Fix constant folding in attributes (#1610) * Fix constant folding in attributes * remove unnecessary change * remove unnecessary change * remove unnecessary change * Fixed circular checking issue. * cleanup * more cleanup * minimize diff * minimize diff * minimize diff	19 November 2020, 09:14:34 UTC
e140c49	jsmall-nvidia	18 November 2020, 23:12:43 UTC	Test for serializing out and reading back Stdlib (#1605) * #include an absolute path didn't work - because paths were taken to always be relative. * Mangling/module name extraction for GenericDecl * Add comment on SerialFilter to explain re-enabling Stmt. * Support setting up SyntaxDecl when reconstructed after deserialization. * Improvements to setup SyntaxDecl. * Fix typo so can read compressed SourceLocs. * Fix issue with SourceManager. * Simple test for serializing out stdlib and reading back in. * Fix calling convention. * Add override to StdLib impls. * Fix typo. * Apply testing to an actual compute test when using load-stdlib Make -load/compile-stdlib processable by Slang Move out testing into util into TestToolUtil so can be shared. * Slightly more concise setup of session. * Fix some errors introduced with session handling. * Made setup for compile same across slangc and slangc-tool.	18 November 2020, 23:12:43 UTC
d898d56	jsmall-nvidia	18 November 2020, 19:52:58 UTC	Serialized stdlib working (#1603) * #include an absolute path didn't work - because paths were taken to always be relative. * Mangling/module name extraction for GenericDecl * Add comment on SerialFilter to explain re-enabling Stmt. * Support setting up SyntaxDecl when reconstructed after deserialization. * Improvements to setup SyntaxDecl. * Fix typo so can read compressed SourceLocs. * Fix issue with SourceManager.	18 November 2020, 19:52:58 UTC
bdc589b	Yong He	17 November 2020, 20:03:45 UTC	Switch CI to github actions. (#1609) * Remove travis config files * change github build script * skip non-tag build on appveyor	17 November 2020, 20:03:45 UTC
7dd0ff9	Yong He	17 November 2020, 16:57:13 UTC	Integrate github actions for linux deployment. (#1607)	17 November 2020, 16:57:13 UTC
39709fb	Yong He	17 November 2020, 16:56:33 UTC	Integrate github actions for build+test on Windows. (#1606)	17 November 2020, 16:56:33 UTC
f4dbe7d	jsmall-nvidia	17 November 2020, 14:49:03 UTC	Fix premake5.lua for profile (#1604) * #include an absolute path didn't work - because paths were taken to always be relative. * The Profile project wasn't including the generated prelude.	17 November 2020, 14:49:03 UTC
46e1901	Yong He	16 November 2020, 19:18:55 UTC	Fix VS2017 Warnings (#1602) * Fix VS2017 Warnings * Update slang-visitor.h Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>	16 November 2020, 19:18:55 UTC
2c893d3	Yong He	11 November 2020, 20:33:32 UTC	Integrate github action for linux build+test. (#1601)	11 November 2020, 20:33:32 UTC
8f0895e	jsmall-nvidia	11 November 2020, 14:56:50 UTC	Include hierarchy output (#1595) * #include an absolute path didn't work - because paths were taken to always be relative. * Improve diagnostic for token pasting. * Token paste location test. * Output include hierarchy. * WIP on includes hierarchy. * Improved include hierarchy output - to handle source files without tokens. Improved test case. * Small comment improvements. Fixed a typo with not returning a reference. * Slight simplification of the ViewInitiatingHierarchy, by adding GetOrAddValue to Dictionary. * Remove the need for ViewInitiatingHierarchy type. * Improve output of path in diagnostic for includes hierarchy. * Remove comment in diagnostic for token-paste-location.slang * Update command line docs to include `-output-includes` Co-authored-by: Yong He <yonghe@outlook.com>	11 November 2020, 14:56:50 UTC
7bcc2b1	Yong He	10 November 2020, 22:55:36 UTC	Use integer RTTI/witness handles in existential tuples. (#1598) * Use integer RTTI/witness handles in existential tuples. * Fix clang error. * Fix IR serialization to use 16bits for opcode. * Undo accidental comment change. * Use variable length encoding for opcode. * Fix compile error. * Fixing issues * Fix code review issues.	10 November 2020, 22:55:36 UTC
1c4d768	Yong He	10 November 2020, 21:07:42 UTC	Fix IR serialization to use variable length encoding for opcode. (#1599) * Fix IR serialization to use 16bits for opcode. * Undo accidental comment change. * Use variable length encoding for opcode. * Fixing issues	10 November 2020, 21:07:42 UTC
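The message for #1599 does not say which variable-length scheme is used for opcodes. As an illustration only, the sketch below uses a LEB128-style encoding (7 payload bits per byte, high bit as a continuation flag), which is one common choice; the function names are hypothetical and this is not taken from the Slang serializer.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer using 7 payload bits per byte."""
    if value < 0:
        raise ValueError("only non-negative values are supported")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, next_offset) for the varint starting at `offset`."""
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return value, offset
        shift += 7


small_opcode = encode_varint(5)    # fits in one byte
large_opcode = encode_varint(300)  # needs two bytes
```

With a scheme like this, frequently used small opcodes stay at one byte while the opcode space is no longer capped by a fixed width.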
c1e0a9d	Yong He	06 November 2020, 18:31:15 UTC	Fix comments. "white-list" -> "allow-list". (#1597)	06 November 2020, 18:31:15 UTC
444ff4d	Yong He	06 November 2020, 18:26:27 UTC	Specialize witness table lookups. (#1596) * Specialize witness table lookups. * Remove generated files from vcxproj * Fix call to generic interface methods.	06 November 2020, 18:26:27 UTC
94861d5	Tim Foley	06 November 2020, 17:12:14 UTC	Set theme jekyll-theme-tactile	06 November 2020, 17:12:14 UTC
0f6765b	Tim Foley	06 November 2020, 00:11:56 UTC	Refactor the flow of type legalization (#1594) The existing type legalization logic worked as a single preorder pass over the IR tree. This could create problems in cases where an instruction might be processed before one of its operands (e.g., a function that references a global shader parameter is processed before that parameter). This change makes it so that type legalization uses a work list, and only adds instructions to the work list once their parent, type, and operands have been processed. As a result, we should be able to guarantee that an instruction will only be processed once all of its operands have been. One wrinkle here is that in the current IR it is possible to end up with a cycle of uses for global-scope instructions, specifically around interface types and their list of requirements. This change includes a short-term kludge to break those cycles and allow the pass to complete. As it stands, this is simply a refactoring pass and no new functionality is introduced. The changes are necessary to unblock work in a feature branch that depends on type legalization being more robust against IR that might use an unexpected ordering.	06 November 2020, 00:11:56 UTC
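The work-list scheme described in #1594 amounts to processing each instruction only after everything it depends on has been processed. A minimal sketch, assuming a hypothetical `deps` mapping from each instruction to its parent, type, and operands (every dependency must itself be a key); unlike the actual change, which includes a kludge to break cycles, this sketch simply reports them:

```python
from collections import deque


def worklist_order(deps):
    """Return the items of `deps` so that each follows all of its dependencies.

    deps: dict mapping item -> iterable of items it depends on.
    Raises ValueError if a dependency cycle prevents completion.
    """
    remaining = {item: set(d) for item, d in deps.items()}
    users = {item: [] for item in deps}
    for item, d in remaining.items():
        for dep in d:
            users[dep].append(item)

    # Start with the items that depend on nothing.
    work = deque(item for item, d in remaining.items() if not d)
    order = []
    while work:
        item = work.popleft()
        order.append(item)
        for user in users[item]:
            remaining[user].discard(item)
            if not remaining[user]:
                work.append(user)  # all dependencies are now processed

    if len(order) != len(deps):
        raise ValueError("dependency cycle")
    return order


order = worklist_order({
    "func": ["global_param", "void"],
    "global_param": ["float"],
    "float": [],
    "void": [],
})
```

Here `func` is listed first but is processed only after `global_param`, which is the ordering problem the single preorder pass could get wrong.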
c985f5f	jsmall-nvidia	05 November 2020, 18:43:00 UTC	Standard library save/loadable (#1592) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix handling of access modifiers inside type definition. * Fix access problem for AST node. Make dumping produce a single function with switch, to potentially make available without Dump specific access. * WIP on serialization design doc. * Remove project references to previously generated files. * More docs on serialization design. * Improve serialization documentation. Remove unused function from IRSerialReader. * Small fixes around naming. Remove long comment from slang-serialize.h - as covered in serialization.md * Remove long comment in slang-serialize.h as covered in serialization.md * More information about doing replacements on read for AST and problems surrounding. * Typo fix. * Spelling fixes. * Value serialize. * Value types with inheritence. * Use value reflection serial conversion for more AST types * Use automatic serialization on more of AST. * Get the types via decltype, simplifies what the extractor has to do. * Update the serialization.md for the value serialization. * Small doc improvements. * Update project. * Remove ImportExternalDecl type Added addImportSymbol and ImportSymbol type Fixed bug in container which meant it wouldn't read back AST module * Because of change of how imports and handled, store objects as SerialPointers. * First pass symbol lookup from mangled names. * Cache current module looked up from mangled name. * Fix SourceLoc bug. Improve comments. * Added diagnostic on mangled symbol not being found * Fix typo. * WIP serializing stdlib. * WIP serializing stdlib in. * Fix problem serializing arrays that hold data that is already serialized. * Remove clash of names in MagicTypeModifier. * Make conversion from char to String explicit. Fix reference count issue with SerialReader. * Add code to save/load stdlib. 
* Use return code to avoid warning - SerialContainerUtil::write(module, options, &stream)) * Make all String numeric ctors explicit. Added isChar to UnownedStringSlice. Added operator== for UnownedStringSlice to String to avoid need to convert to String and allocate. * Add error check to readAllText. * tabs -> spaces on String.h * tab -> spaces String.cpp * Remove msg for StringBuilder, just build inplace for exceptions. * Check SerialClasses - for name clashes. Renamed Modifier::name as Modifier::keywordName * Handling of extensions when deserializing AST - updating the moduleDecl->mapTypeToCandidateExtensions Co-authored-by: Tim Foley <tim.foley.is@gmail.com>	05 November 2020, 18:43:00 UTC
8d4c0ea	Tim Foley	05 November 2020, 02:40:57 UTC	Improve insertion location for "hoistable" instructions (#1593) The Slang IR builder has a notion of "hoistable" instructions, which are basically those instructions that represent a pure side-effect-free operation on their operands, and which can and should be deduplicated. Most types are "hoistable" instructions. In order to make deduplication of hoistable instructions work, we need to emit them at the right location. Consider if we had: ```hlsl void myFunc<T>(...) { if(someCondition) { vector<T, 4> a = ...; ... } else { vector<T, 4> b = ...; } } ``` The IR instruction that represents `vector<T,4>` can't be inserted at the global scope, because then the parameter `T` would not be visible to it. That instruction also shouldn't be inserted into the same block that declares `a`, because then the instruction itself wouldn't be visible at the point where `b` is declared. The IR builder already has logic to pick the right parent instruction. In the example given, the IR instruction for `vector<T,4>` should be inserted into the body of the IR generic, but outside of the IR function that represents `myFunc`. The problem this change fixes is that while the logic was picking the parent for a hoistable instruction correctly, it wasn't putting much care into picking the insertion location. The existing strategy amounted to: * If the IR builder was set with an insertion location inside the chosen parent, then use that insertion location * Otherwise, insert at the end of the chosen parent Neither of those options is perfect. Either could lead to an instruction being inserted after one of its uses, and the second option could even lead to a type being inserted after the `return` instruction in a function/generic, which violates another structural invariant of our IR (that every block must end with a terminator, and terminators must only appear at the end of blocks). 
This change updates the rules as follows: * If the type of the instruction being created, or any of its operands are in the chosen parent, then insert immediately after whichever of those instructions is last in that parent. * Otherwise, insert before the first non-decoration, non-parameter child of the chosen parent The combined effect of these two rules is now that we insert any hoistable instruction as early as we can in its parent, without violating the structural validity rules. (One small exception to these rules is that if the parent is the module then we don't worry about ordering and just insert at the end, since order-of-declaration isn't significant at module scope in our IR) All of our existing tests work with this new behavior, although there could conceivably be future cases that lead to complicated breakage. For example, if a pass looks at the first "ordinary" instruction in a block and saves it to use as an insertion point for parameter, and then proceeds to manipulate code in the block before going back and inserting parameters at the chosen location, there is a chance that a hoistable instruction might have been inserted before the chosen insertion point, leading to a parameter being inserted after an ordinary instruction. In general, though, code that works like that would already be playing a dangerous game in that it is manipulating instructions in a block while assuming the first instruction will remain fixed. This change is currently just a refactor, but the underlying issue surfaced as a bug when I made other changes in a feature branch.	05 November 2020, 02:40:57 UTC
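The updated rules from #1593 can be written as a function that picks an insertion index within the chosen parent. This is a sketch over plain Python lists, not the IR builder's code: the function name, the `kinds` mapping, and its string labels are hypothetical.

```python
def hoistable_insert_index(children, kinds, dependencies, parent_is_module=False):
    """Choose where to insert a hoistable instruction among `children`.

    children:     ordered instruction names in the chosen parent
    kinds:        name -> "decoration", "param", or "ordinary"
    dependencies: names of the new instruction's type and operands
    """
    if parent_is_module:
        # Order of declaration is not significant at module scope.
        return len(children)

    # Rule 1: insert right after the last dependency found in this parent.
    positions = [i for i, c in enumerate(children) if c in dependencies]
    if positions:
        return positions[-1] + 1

    # Rule 2: insert before the first non-decoration, non-parameter child.
    for i, c in enumerate(children):
        if kinds[c] not in ("decoration", "param"):
            return i
    return len(children)


children = ["deco", "T", "x", "y", "ret"]
kinds = {"deco": "decoration", "T": "param",
         "x": "ordinary", "y": "ordinary", "ret": "ordinary"}
after_operand = hoistable_insert_index(children, kinds, {"x"})
no_local_deps = hoistable_insert_index(children, kinds, set())
```

In both cases the new instruction lands before `ret`, so the terminator stays last in the block.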
0600716	Yong He	29 October 2020, 17:21:07 UTC	Generate `switch` based dynamic dispatch logic. (#1591) Co-authored-by: Tim Foley <tim.foley.is@gmail.com>	29 October 2020, 17:21:07 UTC
494e09a	jsmall-nvidia	29 October 2020, 15:45:56 UTC	Handling imported/exporting symbols from serialized modules (#1589) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix handling of access modifiers inside type definition. * Fix access problem for AST node. Make dumping produce a single function with switch, to potentially make available without Dump specific access. * WIP on serialization design doc. * Remove project references to previously generated files. * More docs on serialization design. * Improve serialization documentation. Remove unused function from IRSerialReader. * Small fixes around naming. Remove long comment from slang-serialize.h - as covered in serialization.md * Remove long comment in slang-serialize.h as covered in serialization.md * More information about doing replacements on read for AST and problems surrounding. * Typo fix. * Spelling fixes. * Value serialize. * Value types with inheritence. * Use value reflection serial conversion for more AST types * Use automatic serialization on more of AST. * Get the types via decltype, simplifies what the extractor has to do. * Update the serialization.md for the value serialization. * Small doc improvements. * Update project. * Remove ImportExternalDecl type Added addImportSymbol and ImportSymbol type Fixed bug in container which meant it wouldn't read back AST module * Because of change of how imports and handled, store objects as SerialPointers. * First pass symbol lookup from mangled names. * Cache current module looked up from mangled name. * Fix SourceLoc bug. Improve comments. * Added diagnostic on mangled symbol not being found * Fix typo. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	29 October 2020, 15:45:56 UTC
1d7a7f2	Yong He	28 October 2020, 16:38:56 UTC	Add sequential ID cache in Linkage for witness tables and RTTI objects. (#1590)	28 October 2020, 16:38:56 UTC
13945a5	jsmall-nvidia	26 October 2020, 21:10:24 UTC	Value type serialization via C++ Extractor (#1588) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix handling of access modifiers inside type definition. * Fix access problem for AST node. Make dumping produce a single function with switch, to potentially make available without Dump specific access. * WIP on serialization design doc. * Remove project references to previously generated files. * More docs on serialization design. * Improve serialization documentation. Remove unused function from IRSerialReader. * Small fixes around naming. Remove long comment from slang-serialize.h - as covered in serialization.md * Remove long comment in slang-serialize.h as covered in serialization.md * More information about doing replacements on read for AST and problems surrounding. * Typo fix. * Spelling fixes. * Value serialize. * Value types with inheritence. * Use value reflection serial conversion for more AST types * Use automatic serialization on more of AST. * Get the types via decltype, simplifies what the extractor has to do. * Update the serialization.md for the value serialization. * Small doc improvements. * Update project.	26 October 2020, 21:10:24 UTC
e702b70	jsmall-nvidia	23 October 2020, 20:39:18 UTC	Serialization design doc first pass (#1587) * #include an absolute path didn't work - because paths were taken to always be relative. * WIP on serialization design doc. * More docs on serialization design. * Improve serialization documentation. Remove unused function from IRSerialReader. * Small fixes around naming. Remove long comment from slang-serialize.h - as covered in serialization.md * Remove long comment in slang-serialize.h as covered in serialization.md * More information about doing replacements on read for AST and problems surrounding. * Typo fix. * Spelling fixes.	23 October 2020, 20:39:18 UTC
051b20c	jsmall-nvidia	23 October 2020, 19:07:10 UTC	C++ extractor fix for access modifiers (#1586) * #include an absolute path didn't work - because paths were taken to always be relative. * Fix handling of access modifiers inside type definition. * Fix access problem for AST node. Make dumping produce a single function with switch, to potentially make available without Dump specific access. * Remove project references to previously generated files.	23 October 2020, 19:07:10 UTC
6d1fe29	Yong He	23 October 2020, 06:44:11 UTC	Generate `if` based dispatch logic on GPU targets. (#1585)	23 October 2020, 06:44:11 UTC
10e1bae	jsmall-nvidia	22 October 2020, 12:46:12 UTC	Single pass C++ extraction (#1583) * #include an absolute path didn't work - because paths were taken to always be relative. * Added CharUtil. Added TypeSet to extractor. First pass at being able to specify all headers for multiple output headers. * Fix includes for new C++ extractor convention. Update premake5 to use new extractor mechanisms. * Small improvements around StringUtil. * Split out NameConventionUtil. * Use a 'convert' to convert between convention types. * Fix output of build message for C++ extractor. Improve NameConventionUtil interface. * Improve comments. * Fix warning on gcc. * Fix clang warning. * Fix some typos in NameConventionUtil. * Small fix to premake5.lua * Fix generated includes. * Remove m_reflectType as no longer applicable with TypeSet. * Fix .gitignore for slang-generated-* files. Added getConvention to determine convention from slice. Add versions of split and convert that infer the from convention * Fix typo in splitting camel. * LineWhitespace -> HorizontalWhitespace * Improve CharUtil comments.	22 October 2020, 12:46:12 UTC
c094366	Yong He	21 October 2020, 02:07:14 UTC	Bottleneck interface dispatch calls through a single function. (#1584)	21 October 2020, 02:07:14 UTC
624809a	jsmall-nvidia	20 October 2020, 13:44:48 UTC	Small improvement in AST serialization (#1582) * #include an absolute path didn't work - because paths were taken to always be relative. * Make AST serialization types, marker include _AST_. Ie SLANG_CLASS -> SLANG_AST_CLASS and SLANG_ABSTRACT_CLASS -> SLANG_ABSTRACT_AST_CLASS	20 October 2020, 13:44:48 UTC
9b25d4a	jsmall-nvidia	19 October 2020, 18:35:20 UTC	Fix saving Repro files on Linux (#1581) * #include an absolute path didn't work - because paths were taken to always be relative. * Ascii mode not always set on FileStream. Remove this-> if not needed. Simplify setting of m_fileAccess. * Fix typo. * Fix typo. * Clear up default FileAccess calculation. * Convert tabs to spaces. * Small naming improvements in FileStream::seek.	19 October 2020, 18:35:20 UTC
acf94e7	jsmall-nvidia	19 October 2020, 16:05:18 UTC	Hotfix: Crash due to ContainerDecl->members being altered whilst iterated over (#1580) * #include an absolute path didn't work - because paths were taken to always be relative. * Access the members iteration in _ensureAllDeclsRec via indices to avoid a change in the array invalidating the list. * Fix another iterator of members in SemanticVisitor * Slight improvements to comments - main purpose is to kick a new build.	19 October 2020, 16:05:18 UTC
d3e255b	Tim Foley	15 October 2020, 20:13:49 UTC	Fix a bug in IR lowering (#1578) The basic problem here is that when a function has multiple declarations with matching signatures (e.g., a forward declaration and then a later definition with a body), the IR lowering logic would lower all declarations whenever the first one was encountered, but then would only register an IR value as the lowered version of the first declaration. Other matching declarations would then run the risk of being lowered again, and in the case where they included features like loops with break/continue labels, that would create the risk of keys getting inserted into certain dictionaries more than once, leading to exceptions. This change ensures that when lowering a function that has multiple matching declarations to IR, we register an IR value for all of those declarations and not just the first. I have added a test case that leads to a crash without this change, to ensure that we don't introduce a regression down the line.	15 October 2020, 20:13:49 UTC
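The fix described in #1578 is essentially a memoization detail: when one declaration in a group of matching declarations is lowered, the result must be recorded for every member of the group. A minimal sketch with hypothetical names (`Lowering`, `groups`, and the string IR values are illustrative stand-ins, not Slang's data structures):

```python
class Lowering:
    """Toy model of lowering declarations to IR values with a cache."""

    def __init__(self, groups):
        # groups: declaration -> tuple of all declarations with a matching signature
        self.groups = groups
        self.lowered = {}
        self.lower_count = 0

    def lower(self, decl):
        if decl in self.lowered:
            return self.lowered[decl]
        self.lower_count += 1
        ir_value = f"ir_func_{self.lower_count}"
        # Register the result for *all* matching declarations, not just
        # the one that triggered lowering, so none is lowered a second time.
        for d in self.groups[decl]:
            self.lowered[d] = ir_value
        return ir_value


matching = ("f_forward_decl", "f_definition")
lowering = Lowering({d: matching for d in matching})
first = lowering.lower("f_forward_decl")
second = lowering.lower("f_definition")
```

If only the triggering declaration were cached, the second call would lower the function again, which is the duplicate-key situation the commit message describes.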
4149bf2	jsmall-nvidia	15 October 2020, 15:32:34 UTC	Fix Vk leak (#1579) * #include an absolute path didn't work - because paths were taken to always be relative. * Handle scope of VkShaderModule. * Fix tabbing issue.	15 October 2020, 15:32:34 UTC
4375ceb	Tim Foley	14 October 2020, 20:55:45 UTC	Add reflection API access to global params type layout (#1577) This change adds a single new entry point to our reflection API that allows an application to query the `TypeLayout` that represents the global-scope shader parameters. This can be used by the application in order to detect when the global parameters have required allocation of a default constant buffer, or simply to unify the handling of the global scope with handling of other kinds of parameters.	14 October 2020, 20:55:45 UTC
9cb3174	jsmall-nvidia	13 October 2020, 20:30:25 UTC	Repro test that loads repro (#1576) * #include an absolute path didn't work - because paths were taken to always be relative. * Slang repro test that reloads and runs compiled code.	13 October 2020, 20:30:25 UTC
fab1c9f	Yong He	09 October 2020, 18:29:11 UTC	Support CUDA bindless texture in dynamic dispatch code. (#1575)	09 October 2020, 18:29:11 UTC
11f3317	Yong He	09 October 2020, 16:55:32 UTC	Make RTTI objects __constant__ in CUDA (#1573) Co-authored-by: Yong He <yhe@nvidia.com>	09 October 2020, 16:55:32 UTC
66ab6f4	jsmall-nvidia	09 October 2020, 13:54:16 UTC	NVAPI support doc (#1574) * #include an absolute path didn't work - because paths were taken to always be relative. * Split out NVAPI documentation. Attempt to describe updated usage. * Discuss downstream compiler include paths issues. * Fix links . * Apparently github supports relative links... * Fix typo.	09 October 2020, 13:54:16 UTC
55cd421	Yong He	07 October 2020, 07:30:10 UTC	Fix C++ emit for `bit_cast` inst. (#1570) Co-authored-by: Yong He <yhe@nvidia.com>	07 October 2020, 07:30:10 UTC
4ad2e52	jsmall-nvidia	06 October 2020, 21:07:22 UTC	Use Reflection for (Serial)RefObject Serialization (#1567) * First pass at generalizing serializer. * Split out ReflectClassInfo * Use the general ReflectClassInfo * Fix some typos in debug generalized serialization. * Add calculation of classIds. Make distinct addCopy/add on SerialClasses. * Write up of more generalized serialization * WIP to transition from ASTSerialReader/Writer etc to generalized SerialReader/Writer and associated types. * Improvements to SerialExtraObjects. Keep RefObjects in scope in factory * Compiles with Serial refactor - doesn't quite work yet. * First pass serialization appears to work with refector. * Split out type info for general slang types. * Split out slang-serialize-misc-type-info.h * DebugSerialData -> SerialSourecLocData DebugSerialReader -> SerialSourceLocReader DebugSerialWriter -> SerialSourceLocWriter * Remove unused template that only compiles on VS. * Fix warning around unused function on non-VS. * Improve output of type names that are in scopes in C++ extractor. Update premake5.lua to run generation for RefObject derived types. * C++ extractor working on RefObject type. * Split out serialization functionality that spans different types into slang-serialization-factory.cpp/.h Put AST type info into header. Removed RefObjectSerialSubType - use RefObjectType Add filtering for RefObject derived types Remove construction and filteringhacks. * Set up field serialization for SerialRefObject derived types. * Fix template problem compiling on Clang/Gcc * Work in progress to make Value types work. * Added slang-value-reflect.cpp	06 October 2020, 21:07:22 UTC
8a70e20	jsmall-nvidia	06 October 2020, 17:30:55 UTC	InterlockedExchangeU64 support on RWByteAddressBuffer (#1572) * #include an absolute path didn't work - because paths were taken to always be relative. * Added [__requiresNVAPI] to functions that need nvapi support. * Added support for InterlockedExchangeU64 Added exchange-int64-byte-address-buffer test Fixed typo in cas-int64-byte-address-buffer test * Improve comment around NVAPI usage in hlsl.meta.slang	06 October 2020, 17:30:55 UTC
b6ad8df	jsmall-nvidia	06 October 2020, 13:47:12 UTC	Added [__requiresNVAPI] to functions missing it (#1571) * #include an absolute path didn't work - because paths were taken to always be relative. * Added [__requiresNVAPI] to functions that need nvapi support.	06 October 2020, 13:47:12 UTC
41d8610	Tim Foley	05 October 2020, 18:10:53 UTC	Small fixes for CUDA code emit (#1564) * Small fixes for CUDA code emit * Add a CUDA translation to `GroupMemoryBarrierWithWaveSync()`. We map this to `__syncwarp()` for CUDA (with no mask, implying a full-warp sync). * Consistently use `SLANG_PRELUDE_ASSERT` for assertions introduced in code emit (rather than just using the bare `assert(...)` function, which is not included by our CUDA prelude by default) * Add a new `SLANG_CUDA_STRUCTURED_BUFFER_NO_COUNT` flag to the CUDA prelude that allows the `count` field to be omitted from `(RW)StructuredBuffer<T>`. This is a bit hacky because the computed layouts will still assume the `count` field is present, but this feature is required by at least one client application for now. A better long-term fix will take more time to design and implement. * fixup: CUDA prelude code fix for pedantic compilers Co-authored-by: Tim Foley <tim.foley.is@gmail.com> Co-authored-by: Yong He <yonghe@outlook.com>	05 October 2020, 18:10:53 UTC
d930c65	Yong He	05 October 2020, 17:30:27 UTC	Update the type of a call inst during specialization. (#1569)	05 October 2020, 17:30:27 UTC
3321df7	Yong He	04 October 2020, 08:40:58 UTC	Handle partial existential parameter type specialization. (#1568) * Specialize existential parameters in struct fields. * Cleanup. * Handle partial existential parameter type specialization. Co-authored-by: Yong He <yhe@nvidia.com>	04 October 2020, 08:40:58 UTC
24ecd1f	Yong He	02 October 2020, 19:53:29 UTC	Use new vulkan debug layer. (#1566) * Use new vulkan debug layer. * Try to use VK_LAYER_KHRONOS_validation when it exists. Co-authored-by: Tim Foley <tim.foley.is@gmail.com>	02 October 2020, 19:53:29 UTC
aadf600	Yong He	02 October 2020, 16:49:18 UTC	Specialize existential parameters in struct fields. (#1565) * Specialize existential parameters in struct fields. * Cleanup. Co-authored-by: Yong He <yhe@nvidia.com>	02 October 2020, 16:49:18 UTC
274c20a	jsmall-nvidia	30 September 2020, 17:28:56 UTC	Generalizing Serialization (#1563) * First pass at generalizing serializer. * Split out ReflectClassInfo * Use the general ReflectClassInfo * Fix some typos in debug generalized serialization. * Add calculation of classIds. Make distinct addCopy/add on SerialClasses. * Write up of more generalized serialization * WIP to transition from ASTSerialReader/Writer etc to generalized SerialReader/Writer and associated types. * Improvements to SerialExtraObjects. Keep RefObjects in scope in factory * Compiles with Serial refactor - doesn't quite work yet. * First pass serialization appears to work with refactor. * Split out type info for general slang types. * Split out slang-serialize-misc-type-info.h * DebugSerialData -> SerialSourceLocData DebugSerialReader -> SerialSourceLocReader DebugSerialWriter -> SerialSourceLocWriter * Remove unused template that only compiles on VS. * Fix warning around unused function on non-VS.	30 September 2020, 17:28:56 UTC
94d3f2b	Yong He	27 September 2020, 03:09:50 UTC	Add API for whole program compilation. (#1562) * Add API for whole program compilation. This change exposes a new target flag: `SLANG_TARGET_FLAG_GENERATE_WHOLE_PROGRAM` that can be set on a target with `spSetTargetFlags`. When this flag is set, `spCompile` function generates target code for the entire input module instead of just the specified entrypoints. The resulting code will include all the entrypoints defined in the input source. The resulting whole program code can be retrieved with two new functions: `spGetTargetCodeBlob` and `spGetTargetHostCallable`. This change also cleans up the unnecessary `entryPointIndices` parameter of `TargetProgram::getOrCreateWholeProgramResult`, and modifies the `cpu-hello-world` example to make use of the new whole-program compilation API to simplify its logic. * Update comments.	27 September 2020, 03:09:50 UTC
b72353e	Yong He	24 September 2020, 21:30:12 UTC	Enable default cpp prelude. (#1560) * Enable default cpp prelude. * Print the "#include" line as a normal source if the file does not exist. * Bug fix * Fix. * Fix c++ prelude header. * Remove unnecessary fopen call.	24 September 2020, 21:30:12 UTC
150218b	Tim Foley	24 September 2020, 20:09:40 UTC	Refactor preprocessor API to avoid coupling (#1559) Based on review feedback from #1556, this change updates the Slang preprocessor so that it is no longer coupled to policy details from higher levels of the software stack. In particular, the preprocessor used to: * Deal with updating the list of file paths that a `Module` depends on. * (As of #1556) detect NVAPI-related macro definitions and use them to construct an AST-level `Modifier` attached to the `ModuleDecl`. This change introduces a callback interface where the `Preprocessor` calls out to a `PreprocessorHandler` at certain points during execution, allowing the handler to introduce custom logic that suits a particular high-level use case. This change also removes the dependence of the preprocessor on the `Linkage`, because in practice only a small number of its sub-objects were needed. As a convenience, a wrapper function that takes a `Linkage` was left in place so that the existing call sites didn't have to change very much.	24 September 2020, 20:09:40 UTC
fd2ac53	Tim Foley	24 September 2020, 03:49:35 UTC	Fix GLSL output for byte-address loads of vectors (#1558) While working on #1557, it became clear that something was going wrong when using `*ByteAddressBuffer.Load<T>` to load a vector type on GLSL/SPIR-V targets. The root problem was that the IR-level layout logic (which computes the "natural" layout of a type) had not yet been extended to handle vectors. The fix is simple enough, but it highlights the fact that we probably need to go ahead and "complete" that layout logic sooner or later. This change includes a test case that covers the behavior added here, as well as the case that #1557 fixes. Unfortunately, due to CI system limitations, the HLSL/dxc part of the test is not yet enabled.	24 September 2020, 03:49:35 UTC
8954052	Tim Foley	23 September 2020, 22:47:14 UTC	Simplify workflow when using NVAPI (#1556) In some cases, functionality is available as either a GLSL extension for Vulkan/SPIR-V, or through the NVAPI system for D3D. This situation creates complications because while GLSL extensions are generally all supported by the open-source glslang compiler (which we can bundle and ship), NVAPI operations are exposed through a specific header (`nvHLSLExtns.h`) that ships as part of the NVAPI SDK. When a user wants to explicitly use NVAPI-provided operations in their shader code, there are no major complications for Slang; the user sets up their include paths, `#include`s the relevant header, calls functions in it, and lets Slang deal with the details of compilation. The challenge for Slang arises when we want to provide a cross-platform interface in our standard library (e.g., the `RWByteAddressBuffer.InterlockedAddF32` method that was recently added) that uses either a GLSL extension (when compiling for Vulkan/SPIR-V) or an NVAPI (when compiling to DXBC or DXIL). In that case, the code generated by Slang now has a dependency on NVAPI, and we need to somehow emit a `#include` directive that pulls it in when invoking fxc or dxc. Because we do not (and seemingly cannot) bundle the NVAPI header with the compiler, we have to rely on the user to have it available and to somehow communicate to Slang where it is. Exposing portable routines that sometimes use NVAPI currently creates two main challenges: 1. The user is forced to interact with the "prelude" mechanism in the compiler, which allows the programmer to define code in a given target language that gets prepended to the Slang-generated code. While the prelude mechanism is powerful, it is also hard for users to integrate into their workflow, and our experience so far is that users want something that Just Works. 2. 
If the user writes code that uses some of our abstract operations that layer on NVAPI and they also want to use NVAPI explicitly, they end up with two copies of the NVAPI header (one included by the Slang front-end, and another included by the downstream fxc/dxc compiler). This puts the user in the situation of (a) having to ensure that they set the defines like `NV_SHADER_EXTN_SLOT` consistently both when invoking Slang and when adding their prelude, and (b) even if they do make the definitions consistent, they run into the problem that fxc/dxc complain about overlapping register bindings on the two copies of the `g_NvidiaExt` global shader parameter that the NVAPI header declares. This change attempts to resolve both issues by adding a lot of "do what I mean" logic to the compiler to try to ease things in the common case. In particular: 1. The user no longer needs to use the "prelude" mechanism when using NVAPI. The compiler now embeds a default prelude for HLSL output, which will `#include` the NVAPI header if and only if the generated code needs NVAPI access because of portable standard library routines that were used. 2. The user can mix-and-match explicit NVAPI use and stdlib functions that compile to use NVAPI. The register/space to be used by NVAPI when included via prelude is now set based on whatever the user set via the preprocessor so that it should automatically be consistent between both cases. Furthermore, the code we emit for the declaration of `g_NvidiaExt` when compiling explicit NVAPI use is set up to be conditional, so that it is skipped in the case where the prelude will pull in its own declaration of that parameter. The way all this is achieved involves a lot of moving pieces: * We now have an HLSL prelude, which mostly just serves to `#include "nvHLSLExtns.h"` in the case where NVAPI support is needed downstream. * Standard library operations that require NVAPI for their implementation on HLSL include a new `[__requiresNVAPI]` attribute. 
* The preprocessor has been extended so that after tokenizing an input file it looks up the NVAPI-relevant macros in the resulting environment, and if they are set it attaches a modifier (`NVAPISlotModifier`) to the AST `ModuleDecl` that is based on their values. Logic is added to detect if multiple input files specify values for the macros in ways that conflict. * The semantic checking step is extended so that it detects the "magic" NVAPI declarations (the `g_NvidiaExt` parameter and the `NvShaderExtnStruct` type that it uses) and attaches a modifier to them so that they can be identified as such in later steps. * Parameter binding is extended to collect a list of the AST modifiers that reflect NVAPI binding, and to reserve the relevant register(s) so that ordinary user-defined parameters cannot conflict with them. * IR lowering translates the three new AST modifiers related to NVAPI over to IR equivalents. * IR linking is extended to make sure that it clones any `IRNVAPISlotDecoration`s attached to the input modules. The pass intentionally does not care where the modifiers came from; it just collects them all and leaves it to downstream code to sort out what they mean. * Emit logic is extended to have a notion of "prelude directives" which are preprocessor directives that should come before the prelude in the generated code, because they can impact the way that the prelude compiles. This is done so that we don't have to introduce ad hoc logic for each downstream compiler to set any relevant `-D` flags (e.g., both fxc and dxc would need to duplicate such logic for NVAPI support). * The HLSL source emitter is extended to track whether it emits any operations that require NVAPI support. * The HLSL source emitter is extended to emit prelude directives based on whether NVAPI is needed and, if it is, to also set the register and space that NVAPI should use based on what was stored in the decoration(s) on the IR module. 
* The HLSL source emitter is extended so that it detects global instructions that represent "magic" NVAPI constructs, and emits them as conditional definitions so that they are skipped when NVAPI is included via the prelude. * The handling of required capabilities during emit logic was cleaned up a bit so that more logic is shared across targets, and also so that the same logic is used both when emitting a function declaration/definition and when emitting a call to an intrinsic function (which won't get declared/defined).	23 September 2020, 22:47:14 UTC
3d063a7	Tim Foley	23 September 2020, 19:24:16 UTC	Fix a bug around byte-address buffer loads of vectors (#1557) This problem is only visible when * using `RWByteAddressBuffer.Load<V>`, * when `V` is `vector<T,N>`, * and `V` is a non-32-bit type In such a case, the Slang compiler generates output HLSL like: someBuffer.Load<vector<T,N>>(someOffset); and dxc balks because it fails to parse the `>>` as closing the generics, and instead parses it as a shift operator. The solution here is simple: add a space before the closing `>` when emitting a `.Load<T>` operation. Note that this change does not come with a test fix yet because writing the test case exposed a more complicated issue with GLSL codegen for this same scenario. This change includes the simple single-byte fix, to unblock users while we work on a fix for the GLSL case (and that fix will include the test coverage).	23 September 2020, 19:24:16 UTC
a5b0cde	jsmall-nvidia	21 September 2020, 19:45:35 UTC	Allow #include of absolute paths (#1555) * #include an absolute path didn't work - because paths were taken to always be relative. * Improve comments. * Small comment improvement.	21 September 2020, 19:45:35 UTC
83514bd	Yong He	21 September 2020, 15:27:10 UTC	Enable all dynamic dispatch tests on CUDA. (#1552) * Enable all dynamic dispatch tests on CUDA. * Fix expected cross-compile test results.	21 September 2020, 15:27:10 UTC
21339e8	jsmall-nvidia	18 September 2020, 17:35:45 UTC	Serialization fixes based on review of #1547 (#1551) * Test if blob is returned. * Rename serialize files so can be grouped. * StringRepresentationCache -> SerialStringTable * Split out SerialStringTable from slang-serialize-ir * First pass at reorganizing serialization/containers. Remain some issues about debug info. * Fix bug in calculating sourceloc. * Improve calcFixSourceLoc * Make allocations for payload RiffContainer align to at least 8 bytes. This is important for read, if the payload can contain 8 byte aligned data. Note this has no effect on Riff file format alignment rules. * Improve comments around RiffContainer and alignment. * Remove SerialStringTable, can just use StringSlicePool instead. * Add flags to control what is output in SerialContainer. Turn off AST output for obfuscated code. Lazily create astClasses when doing write container serialization. * Typo fix for Clang/Linux. * Fixes that came out of review * TranslationUnit -> Module * TargetModule -> TargetComponent * PAYLOAD_MIN_ALIGNMENT -> kPayloadMinAlignment	18 September 2020, 17:35:45 UTC
9a6eec6	jsmall-nvidia	18 September 2020, 15:02:06 UTC	Control container serialization with SerialOptionFlags (#1550) * Test if blob is returned. * Rename serialize files so can be grouped. * StringRepresentationCache -> SerialStringTable * Split out SerialStringTable from slang-serialize-ir * First pass at reorganizing serialization/containers. Remain some issues about debug info. * Fix bug in calculating sourceloc. * Improve calcFixSourceLoc * Make allocations for payload RiffContainer align to at least 8 bytes. This is important for read, if the payload can contain 8 byte aligned data. Note this has no effect on Riff file format alignment rules. * Improve comments around RiffContainer and alignment. * Remove SerialStringTable, can just use StringSlicePool instead. * Add flags to control what is output in SerialContainer. Turn off AST output for obfuscated code. Lazily create astClasses when doing write container serialization. * Typo fix for Clang/Linux.	18 September 2020, 15:02:06 UTC
2ddca33	Yong He	17 September 2020, 23:46:23 UTC	Initial attempt to enable CUDA dynamic dispatch codegen (#1549) * Front-load cuda module loading to fill in RTTI pointers. * Enable dynamic dispatch codegen for CUDA.	17 September 2020, 23:46:23 UTC
017acb3	Yong He	17 September 2020, 22:07:08 UTC	Front-load cuda module loading to fill in RTTI pointers. (#1548) Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	17 September 2020, 22:07:08 UTC
b9cddcb	jsmall-nvidia	17 September 2020, 20:47:57 UTC	Share debug information between AST and IR (#1547) * Test if blob is returned. * Rename serialize files so can be grouped. * StringRepresentationCache -> SerialStringTable * Split out SerialStringTable from slang-serialize-ir * First pass at reorganizing serialization/containers. Remain some issues about debug info. * Fix bug in calculating sourceloc. * Improve calcFixSourceLoc * Make allocations for payload RiffContainer align to at least 8 bytes. This is important for read, if the payload can contain 8 byte aligned data. Note this has no effect on Riff file format alignment rules. * Improve comments around RiffContainer and alignment. * Remove SerialStringTable, can just use StringSlicePool instead. * Typo fix for Clang/Linux. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	17 September 2020, 20:47:57 UTC
bbf492a	Tim Foley	17 September 2020, 19:13:50 UTC	Embed default prelude for CUDA (#1546) * Embed default prelude for CUDA Slang supports the notion of a "prelude" that gets prepended to the source code we generate in a target language. For some targets, a prelude is not necessary (e.g., we compile to HLSL/GLSL and then on to DXBC/DXIL/SPIR-V just fine without a prelude), but some targets have been implemented in a way that makes a prelude necessary (notably CPU and CUDA). For the targets that require a prelude, the Slang codebase includes usable preludes under the `prelude/` directory. Prior to this change, if a user was compiling for such a target (whether via command-line or API), they had to take responsibility for specifying the prelude to use (usually by passing in the contents of the prelude file(s) already included in the Slang distribution). It is reasonable for a user to expect an out-of-the-box experience where compilation to CUDA PTX or native CPU code should Just Work, similarly to how compilation to SPIR-V Just Works. This change is a step in the direction of providing a user experience that Just Works for common cases. The main addition here is a tool called `slang-embed` that we run during our build to turn the `prelude/*.h` files into `prelude/*.h.cpp` files that embed the contents of the original `.h` file as a `const` variable. By compiling and linking in the generated `.h.cpp` file for the CUDA prelude, we are then able to set the default prelude to use for CUDA at the time a session/linkage is created. That default prelude will be used unless the user manually specifies their own prelude (which current users of the CUDA back-end must be doing). This change only sets up a default prelude for CUDA because of the way that the CPU prelude is split across multiple files. A strategy that provides a good default prelude for CPU may take more work, but that work might also be unnecessary if we switch to a strategy of using LLVM to generate native code. 
The implementation of the `slang-embed` tool is intentionally simple, and it will likely run into issues if/when we need to embed binary files or larger text files. The assumption being made here is that we can address those issues when they arise, and there is no reason to over-engineer the tool right now. The way that `slang-embed` is integrated into our build process is likely to require some iteration to make sure that it works across all platforms. I expect that this change will have multiple follow-up fixes related to trying to get the build to work as expected across all targets on CI. * fixup: trying to ensure that embedded prelude gets compiled into slang * fixup: properly clean up allocations in slang-embed * fixup: fix double free introduced by previous change * fixup: off-by-one allocation error	17 September 2020, 19:13:50 UTC
