Revision Author Date Message Commit Date
eaafafe Update DXR API definitions for final spec. (#659) * Update DXR API definitions for final spec. The final version of the DXR API has changed the result type of the `DispatchRaysIndex()` and `DispatchRaysDimensions()` builtins to `uint3` (from `uint2`). * Add updates for DXR object<->world transformations The `ObjectToWorld()` and `WorldToObject()` functions were renamed to `ObjectToWorld3x4()` and `WorldToObject3x4()`, respectively, and then new functions `ObjectToWorld4x3()` and `WorldToObject4x3()` were added to give convenient access to the transpose of these matrices. (No, I'm not clear on why users couldn't just call `transpose()`, either) I've left the old function names in the standard library as forwarding functions just so that we don't break existing DXR code that relied on the old names. 03 October 2018, 23:03:37 UTC
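The relationship between the `3x4` and `4x3` variants described in eaafafe can be illustrated on the host side. This is only a sketch: `Float3x4`, `Float4x3`, and `transpose3x4` are hypothetical names invented for the illustration and are not part of DXR or Slang; the sketch only shows that the `4x3` form holds the same values as the `3x4` form with rows and columns swapped.

```cpp
#include <array>
#include <cassert>

// Hypothetical row-major stand-ins for HLSL float3x4 / float4x3.
using Float3x4 = std::array<std::array<float, 4>, 3>;
using Float4x3 = std::array<std::array<float, 3>, 4>;

// Returns the transpose of a 3x4 matrix as a 4x3 matrix:
// element (r, c) of the input becomes element (c, r) of the output.
Float4x3 transpose3x4(const Float3x4& m) {
    Float4x3 t{};
    for (int r = 0; r < 3; ++r) {
        for (int c = 0; c < 4; ++c) {
            t[c][r] = m[r][c];
        }
    }
    return t;
}
```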
cde0dec Feature/ir serial debug (#657) * * Change the layout of IROp such that 'main' IROps are 0-x. * Removed MANUAL_RANGE instruction types, as no longer needed. * Work in prog on optimizing. * * Constant time lookup for IROpInfo * Refactor and document a little more the IROp layout * Mark ops that use 'other' bits * Fix typo in definition of kIROpFlag_UseOther * First pass at working out serialization structure. * Work in progress on ir-serialize * Storing strings in IRSerialInfo Split out IRSerialInfo from the IRSerializer - to make more explicit what is actually saved. * First pass at serializing out data. * First pass at serialize reading. * Fix riff fourcc mark order. * First pass at reconstructing IRInst / IRDecoration from serialized data. * Handling of TextureBaseType * Deserializing of constants. * Small changes around ir serialization. * Changed StringIndex indexing to not be an offset into the m_strings array, but an index into strings in order. Doing so makes cache lookup much faster, and makes the 'indices' themselves smaller and therefore more compressible. * Removed the need for m_arena in IRSerialWriter. Previously its purpose was to store the string contents that were being used to look up UnownedStringSlice. Now we keep the StringRepresentation in scope and reference that, and so don't need the copy. * Don't need to construct the IRModuleInst as is created and set on createModule call. * Remove test code for testing serialization. * Fix problem with release build in ir-serialize causing warning. * Use SLANG_OFFSET_OF for offsets in non pod classes to avoid gcc/clang warning. Give storage to integral static variables to avoid linkage problems with gcc/clang. * Fix warnings under x86 win32 debug. * Small improvements around IR serialization. * * Support for serializing SourceLoc. * Small improvements around serialization. * RawSourceLoc allows for regular SourceLoc information to be held (and serialized) as is. 
This is only really useful for the 'passthru' mode as there needs to be a more compact mechanism to encode source locations. * Small fixes around comments for SourceLoc serializing. 02 October 2018, 21:22:15 UTC
852ea40 Update README.md 28 September 2018, 17:17:20 UTC
648fc9b Feature/ir serialize improvements (#655) * * Change the layout of IROp such that 'main' IROps are 0-x. * Removed MANUAL_RANGE instruction types, as no longer needed. * Work in prog on optimizing. * * Constant time lookup for IROpInfo * Refactor and document a little more the IROp layout * Mark ops that use 'other' bits * Fix typo in definition of kIROpFlag_UseOther * First pass at working out serialization structure. * Work in progress on ir-serialize * Storing strings in IRSerialInfo Split out IRSerialInfo from the IRSerializer - to make more explicit what is actually saved. * First pass at serializing out data. * First pass at serialize reading. * Fix riff fourcc mark order. * First pass at reconstructing IRInst / IRDecoration from serialized data. * Handling of TextureBaseType * Deserializing of constants. * Small changes around ir serialization. * Changed StringIndex indexing to not be an offset into the m_strings array, but an index into strings in order. Doing so makes cache lookup much faster, and makes the 'indices' themselves smaller and therefore more compressible. * Removed the need for m_arena in IRSerialWriter. Previously its purpose was to store the string contents that were being used to look up UnownedStringSlice. Now we keep the StringRepresentation in scope and reference that, and so don't need the copy. * Don't need to construct the IRModuleInst as is created and set on createModule call. * Remove test code for testing serialization. * Fix problem with release build in ir-serialize causing warning. * Use SLANG_OFFSET_OF for offsets in non pod classes to avoid gcc/clang warning. Give storage to integral static variables to avoid linkage problems with gcc/clang. * Fix warnings under x86 win32 debug. * Small improvements around IR serialization. 28 September 2018, 13:39:08 UTC
d06bd7d First pass implementation of IR serialization (#653) * * Change the layout of IROp such that 'main' IROps are 0-x. * Removed MANUAL_RANGE instruction types, as no longer needed. * Work in prog on optimizing. * * Constant time lookup for IROpInfo * Refactor and document a little more the IROp layout * Mark ops that use 'other' bits * Fix typo in definition of kIROpFlag_UseOther * First pass at working out serialization structure. * Work in progress on ir-serialize * Storing strings in IRSerialInfo Split out IRSerialInfo from the IRSerializer - to make more explicit what is actually saved. * First pass at serializing out data. * First pass at serialize reading. * Fix riff fourcc mark order. * First pass at reconstructing IRInst / IRDecoration from serialized data. * Handling of TextureBaseType * Deserializing of constants. * Small changes around ir serialization. * Changed StringIndex indexing to not be an offset into the m_strings array, but an index into strings in order. Doing so makes cache lookup much faster, and makes the 'indices' themselves smaller and therefore more compressible. * Removed the need for m_arena in IRSerialWriter. Previously its purpose was to store the string contents that were being used to look up UnownedStringSlice. Now we keep the StringRepresentation in scope and reference that, and so don't need the copy. * Don't need to construct the IRModuleInst as is created and set on createModule call. * Remove test code for testing serialization. * Fix problem with release build in ir-serialize causing warning. * Use SLANG_OFFSET_OF for offsets in non pod classes to avoid gcc/clang warning. Give storage to integral static variables to avoid linkage problems with gcc/clang. * Fix warnings under x86 win32 debug. 27 September 2018, 15:36:35 UTC
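The `StringIndex` change mentioned in d06bd7d (an index into the strings in order, rather than an offset into a character array) can be sketched roughly as below. This is not the actual `IRSerialWriter` code: `StringTable`, `add`, and `get` are hypothetical names, and the sketch assumes a plain hash map for deduplication.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical string table: each distinct string receives the next
// ordinal index (0, 1, 2, ...) instead of a byte offset into a buffer.
class StringTable {
public:
    using Index = std::uint32_t;

    // Returns the existing index for `s`, or assigns the next ordinal.
    Index add(const std::string& s) {
        auto it = m_lookup.find(s);
        if (it != m_lookup.end()) {
            return it->second;
        }
        const Index index = static_cast<Index>(m_strings.size());
        m_strings.push_back(s);
        m_lookup.emplace(s, index);
        return index;
    }

    // Looks a string up by ordinal; throws std::out_of_range if invalid.
    const std::string& get(Index index) const { return m_strings.at(index); }

    std::size_t size() const { return m_strings.size(); }

private:
    std::vector<std::string> m_strings;               // strings in insertion order
    std::unordered_map<std::string, Index> m_lookup;  // string -> ordinal
};
```

Because ordinals grow by one per distinct string, they stay small even when the strings themselves are long, which is consistent with the compressibility remark in the commit message.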
ee54994 Improve IROp lookup (#650) * * Change the layout of IROp such that 'main' IROps are 0-x. * Removed MANUAL_RANGE instruction types, as no longer needed. * Work in prog on optimizing. * * Constant time lookup for IROpInfo * Refactor and document a little more the IROp layout * Mark ops that use 'other' bits * Fix typo in definition of kIROpFlag_UseOther 25 September 2018, 21:59:16 UTC
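One way to read "main IROps are 0-x" together with "constant time lookup for IROpInfo" in ee54994 is that contiguous opcodes can index directly into a table. The sketch below assumes that reading; the `Op` enumerators, `OpInfo` fields, and `getOpInfo` are hypothetical and do not reproduce Slang's real definitions.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical opcodes: the "main" ops are numbered contiguously from 0,
// so the opcode value doubles as an array index.
enum Op : int { kOp_Nop = 0, kOp_Add, kOp_Mul, kOp_Return, kOp_CountOf };

struct OpInfo {
    const char* name;
    int fixedOperandCount;
};

// One entry per main op, in enum order.
static const OpInfo kOpInfos[kOp_CountOf] = {
    {"nop", 0},
    {"add", 2},
    {"mul", 2},
    {"return", 1},
};

// O(1) lookup: a bounds check followed by a direct array index.
const OpInfo* getOpInfo(int op) {
    if (op < 0 || op >= kOp_CountOf) {
        return nullptr;
    }
    return &kOpInfos[op];
}
```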
4f979d7 Fixes around atomic operations (#652) * Fixes around atomic operations Work on #651 The existing handling of atomic operations had a few issues: * The HLSL atomic functions (`Interlocked*`) didn't have mappings to GLSL * Atomic operations on images weren't supported at all because the subscript operation on `RWTexture*` types didn't provide a `ref` accessor * The HLSL atomic functions were only providing the overloads that return the previous value through an `out` parameter, and not the ones that ignore the previous value. This change fixes these issues with the following changes: * `RWTexture*` types now have a `ref` accessor on their subscript operation which maps to a new `imageSubscript` operation in the IR. By default this translates back to `tex[idx]` in output HLSL, but it makes a custom mapping possible for GLSL * The `Interlocked*` function definitions were expanded to include the overloads without the `out` parameter * GLSL translations were added for the `Interlocked*` functions. These mappings use some new customization points in the intrinsic operation emit logic to support outputting calls to either `atomic*` or `imageAtomic*` as required, and to expand an argument that is a subscript into an image as multiple arguments. This whole approach is quite hacky, and it doesn't seem like the approach we should take in the long run. * Fix: typo in InterlockedAnd lowering One of the cases of `InterlockedAnd` was lowering to `atomicAnd` with a `$0` where we wanted the `$A` substitution to handle the possibility of an image. 25 September 2018, 02:17:12 UTC
32c8479 Remap IROp value ranges * Change the layout of IROp such that 'main' IROps are 0-x. (#649) * Removed MANUAL_RANGE instruction types, as no longer needed. 24 September 2018, 21:20:32 UTC
7250ed1 Remove the "hack sampler" workaround (#648) * Update glslang version * Fix build for new glslang The latest glslang required a few changes to our manual build for their code (because we are *not* taking a dependency on CMake). * Rebuild project files using premake, which picks up a few files added to glslang, but also a few diffs in Slang's own project files in cases where they were edited manually instead of using premake. * Fix up the declaration of our device limits (which are intentionally set to *not* limit what code passes through our glslang), because the underlying structure definition in glslang has changed. This is a kludgy bit of glslang's design, but it doesn't make sense for us to invest in a more serious workaround. * Remove the "hack sampler" workaround When the `GL_KHR_vulkan_glsl` spec was introduced to allow GLSL to be compiled for Vulkan SPIR-V, it made an annoying mistake by leaving a few builtins as taking `sampler2D`, etc. when the equivalent SPIR-V operations only require a `texture2D`, etc. The relevant builtins are: * `textureSize` * `textureQueryLevels` * `textureSamples` * `texelFetch` * `texelFetchOffset` This means that shader code that wanted to use those operations needed to conspire to have a `sampler` handy so they could write, e.g.:
```glsl
vec4 val = texelFetch(sampler2D(myTexture, someRandomSampler), p, lod);
```
when what they really wanted was this:
```glsl
vec4 val = texelFetch(myTexture, p, lod);
```
That is annoying but probably something easy to work around for a GLSL programmer, but when cross-compiling from HLSL, you might have an operation like:
```hlsl
float4 val = myTexture.Load(p);
```
in which case a cross-compiler needs to manufacture a sampler out of thin air. If the shader happened to use a sampler for something else you could snag that, but in the worst case you had to cross-compile to GLSL that declared a new sampler. 
Slang did this by declaring a sampler called `SLANG_hack_samplerForTexelFetch` (because `texelFetch` is the operation that first surfaced the issue). For complex reasons we *always* define this sampler, even if we turn out not to need it in a particular output kernel. This choice has a bunch of annoying consequences: * There is *always* a sampler defined in descriptor set zero, because that's where we put the hack sampler, so a user-defined parameter block always has a set number of 1 or greater (see #646). * The hack sampler shows up in reflection output because users need to size their descriptor sets appropriately to pass along this sampler that won't actually be used if they don't want to get debug spew from the validation layers. We filed an issue on glslang about this problem, and eventually some kind folks from the gamedev community (who also saw the same problem) defined an extension spec (`GL_EXT_samplerless_texture_functions`) to fix the underlying issue and contributed a patch to glslang to make it support that extension. This change just backs the hack out of Slang now that we have a glslang version that supports the extension to get past the defect in the original GLSL-for-Vulkan definition. Besides yanking out the code for the hack, we also change the relevant builtins to declare that they require this new GLSL extension (so that we properly request it from glslang when the builtins are used), and fix some reflection test cases that exposed the existence of the "hack sampler." * Fixup: syntax error in stdlib generator files * Remove more code for hack sampler There was logic to ensure we always have a "default" register space/set when cross-compiling, because the hack sampler would need it. This is no longer necessary once we remove the hack sampler. * Fix expected test output. 
Fixing the root cause of issue #646 means that one of our test cases that tickles that issue now produces different output (luckily it can now be used as a regression test for the issue). 21 September 2018, 18:12:23 UTC
738bcb8 Improve support for non-32-bit types. (#643) The main change here is to fill out the `BaseType` enumeration so that it covers the full range of 8/16/32/64-bit signed and unsigned integers, as well as 16/32/64-bit floating-point numbers, and then propagate that completion through various places in the code. More details: * The current `half`, `float`, `double`, `int`, and `uint` types are still the default names for their types, so things like `float16_t` and `int32_t` were added as `typedef`s. * We still need to generate the full gamut of vector/matrix `typedef`s for the new types, so that things like `float16_t4x3` will work (yes, I know that is ugly as sin, but that's the HLSL syntax...). * A few pieces of dead code from earlier in the compiler's life got removed, since I did a find-in-files for `BaseType::` and tried to either update or delete every site. * A few call sites that were enumerating integer base types in an ad-hoc fashion were changed to use a single `isIntegerBaseType()` function that I added in `check.cpp` * When compiling with dxc for shader model 6.2 and up, we enable the compiler's support for native 16-bit types via a flag. * The public API enumeration for reflection of scalar types added cases for 8- and 16-bit integers (it already exposed the other cases we need) * The lexer was updated to be extremely liberal in what kinds of suffixes it allows on literals. I also removed the logic that was treating, e.g., `0f` as a floating-point literal (it doesn't seem to be the right behavior). That would now be an integer literal with an invalid suffix. * The logic in the parser that applies types to literals was updated to handle a few more cases: `LL` and `ULL` for 64-bit integers, and `H` for 16-bit floats. * The mangling logic needed to be updated to handle the new cases, and I consolidated the handling of those types in their front-end and IR forms. 
* Removed the explicit `BasicExpressionType::ToString` logic, since all basic types are `DeclRefType`s in the front end, and we can just print them out as such. * As a bit of a gross hack, fudged the conversion costs so that `int` to `int64_t` conversion is a bit more costly. The problem there is that given an operation like `int(0) + uint(0)`, the best applicable candidates ended up being `+(uint,uint)` and `+(int64_t,int64_t)` because the cost of a single `int`-to-`uint` conversion was the same as the sum of the cost of an `int`-to-`int64_t` and a `uint`-to-`int64_t`. A better long-term fix here is to completely change our overload resolution strategy, but that is obviously way too big to squeeze into this change. * Type layout computation was updated to handle all the new types and give them their natural size/alignment. Note that this does *not* work for down-level HLSL where `half` is treated as a synonym for `float`. It also doesn't deal with the fact that many of these types aren't actually allowed in constant buffers for certain shader models. A future change should work to add error messages for unsupported stuff during type layout (or just make the types themselves require support for certain capabilities) 20 September 2018, 15:14:25 UTC
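The overload-resolution tie described in 738bcb8 for `int(0) + uint(0)` comes down to simple cost arithmetic. The sketch below uses made-up cost values (200, 100, 150) purely for illustration; the real costs in the compiler are not stated in the commit message, and `conversionCost` and `candidateCost` are hypothetical helpers.

```cpp
#include <cassert>

enum class Type { Int, UInt, Int64 };

// Hypothetical conversion costs. `intToInt64Cost` is a parameter so the
// "before" and "after" behaviour of the fudge can be compared.
int conversionCost(Type from, Type to, int intToInt64Cost) {
    if (from == to) return 0;
    if (from == Type::Int && to == Type::UInt) return 200;   // assumed value
    if (from == Type::Int && to == Type::Int64) return intToInt64Cost;
    if (from == Type::UInt && to == Type::Int64) return 100; // assumed value
    return 1000; // conversions not needed by this example
}

// Total cost of calling `+(param, param)` with arguments of types a and b.
int candidateCost(Type a, Type b, Type param, int intToInt64Cost) {
    return conversionCost(a, param, intToInt64Cost) +
           conversionCost(b, param, intToInt64Cost);
}
```

With the assumed numbers, an `int`-to-`int64_t` cost of 100 makes `+(uint,uint)` and `+(int64_t,int64_t)` tie at 200, while raising that one cost to 150 makes the 64-bit candidate cost 250 and lets `+(uint,uint)` win.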
653fe97 Support for IRStringLit (#645) * * Added support for strings in IR with IRStringLit - with storage of chars after it * Added kIRDecorationOp_Transitory - can be used for detecting instructions constructed on stack * Made IRConstant hashing work off type * Fix comment that is out of date about how an instruction is determined to hold a transitory string. 19 September 2018, 19:27:50 UTC
a37b353 Warn when undefined identifier used in preprocessor conditional (#642) This can mask an error when the user either mistypes a macro name when writing a conditional, or (as was the case for the user who pointed out this issue) they mistakenly assume that a `#define` in an `import`ed file has been made visible to them. This change just adds the warning in the obvious place, with a test case to ensure it triggers. 19 September 2018, 15:32:28 UTC
091f89a Remove IRDeclRef as recommended in comments for PR #635 as no longer used. (#638) 17 September 2018, 16:01:52 UTC
8a150a9 Control unit tests being run with -category -exclude and using prefix. (#637) Unit tests appear in unit-test category. Unit tests 'appear' in a directory unit-tests. Removed the -unitTests option. 17 September 2018, 14:28:11 UTC
24ad492 Hotfix/fixing warnings (#636) * * Remove dispose from IRInst * Use MemoryArena instead of MemoryPool * Make all IRInst not require Dtor - by having ref counted array store ptrs that need freeing * Increase block size - typically compilation is 2Mb of IR space(!) * Fix issues around StringRepresentation::equal because null has special meaning. * Don't bother to construct as String to compare StringRepresentation, just use UnownedStringSlice. * Added fromLiteral support to UnownedStringSlice and use instead of strlen version. * Use more conventional way to test StringRepresentation against a String. * Fix gcc/clang template problem with cast. * Fix warnings. 17 September 2018, 13:18:57 UTC
3c505c2 Improvements around IR representation and memory usage (#635) * * Remove dispose from IRInst * Use MemoryArena instead of MemoryPool * Make all IRInst not require Dtor - by having ref counted array store ptrs that need freeing * Increase block size - typically compilation is 2Mb of IR space(!) * Fix issues around StringRepresentation::equal because null has special meaning. * Don't bother to construct as String to compare StringRepresentation, just use UnownedStringSlice. * Added fromLiteral support to UnownedStringSlice and use instead of strlen version. * Use more conventional way to test StringRepresentation against a String. * Fix gcc/clang template problem with cast. 14 September 2018, 18:16:28 UTC
e1c9349 Add a better error message for common global generic failure (#634) A common mistake that seems to come up when using global generic type parameters:
```hlsl
interface IHero { ... }
type_param H : IHero;
ParameterBlock<H> gHero;
```
is to accidentally try to specialize the type parameter `H` using `H` itself as the argument (instead of some concrete type like `Batman`). The current front-end checks naively let this pass, because `H` satisfies all the requirements (it sure does declare that it implements `IHero`, which is the only requirement we have). This currently leads to downstream failure when we generate code with generic type parameters still left in the IR. This change implements a simple fix which is to:
- Check when we are trying to specialize a global generic parameter using another global generic parameter, since this is currently always a mistake
- Add a special-case diagnostic for the 99% case of this failure, which is specializing a type parameter to itself
This fix is primarily motivated by the way generics support will initially be implemented in Falcor. 13 September 2018, 21:32:13 UTC
929745d Feature/memory arena improvements (#633) * First pass at MemoryArena. * First pass at RandomGenerator. * Extract TestContext into external source file. * Fix warning on printf. * Use enum classes for Test enums. OutputMode -> TestOutputMode. * First pass at FreeList unit test. * Auto registering tests. Improvements to RandomGenerator. * Remove the need for unitTest headers - cos can use registering. * Added unitTest for MemoryArena. * Do unit tests. * Fix typo. * Fix problem limiting errors from TestContext. * Refactor of MemoryArena * Removed the ability to rewind (to improve memory usage/simplify) * Better memory usage - around oversized blocks + Will keep allocating from a normal block if more than 1/3 memory left, or an oversized block is allocated * Better unitTest coverage for MemoryArena. * Fixes based on code review * Remove e prefix from enum class types for TestContext * Added extra checking for allocations sizes * Fixed some typos * Added std::is_pod test to allocateAndCopyArray * Add include for is_pod needed for linux build. 13 September 2018, 18:02:33 UTC
f60135c Feature/memory arena (#631) * First pass at MemoryArena. * First pass at RandomGenerator. * Extract TestContext into external source file. * Fix warning on printf. * Use enum classes for Test enums. OutputMode -> TestOutputMode. * First pass at FreeList unit test. * Auto registering tests. Improvements to RandomGenerator. * Remove the need for unitTest headers - cos can use registering. * Added unitTest for MemoryArena. * Do unit tests. * Fix typo. 12 September 2018, 20:27:42 UTC
9a97330 Add basic support for #pragma once (#630) * Improve diagnostic messages for function redefinition The front-end was using internal "not implemented" errors instead of friendly user-facing errors to handle: * Redefinition of a function (same signature and both have bodies) * Multiple function declarations/definitions with the same parameter signature, but different return types This change simply turns both of these into reasonably friendly errors that explain what went wrong and point to the previous definition/declaration as appropriate. * Add support for detecting #pragma directives and handling them The logic here mirrors what was set up for preprocessor directives, just for "sub-directives" in this case. The only case here is the default one, which now reports a warning for directives we don't understand. * Add basic support for #pragma once Fixes #494 The approach here is simplistic in the extreme. When we see a `#pragma once` directive, we put the current file path (the location of the `#pragma` directive, as reported by our source manager) into a set, and then any paths in that set are ignored by subsequent `#include` directives. This should work for simple cases of `#pragma once`, but it is likely to fail in a variety of cases because our filesystem layer currently makes no attempt to normalize/canonicalize paths. Improving the robustness of the solution is left to future work. This change includes a simple test case to confirm that a second `#include` of a file with a `#pragma once` is successfully ignored. 27 August 2018, 19:33:35 UTC
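The set-based `#pragma once` approach described in 9a97330 can be sketched as follows. `IncludeTracker`, `markPragmaOnce`, and `shouldInclude` are hypothetical names for illustration, not the preprocessor's actual interface; the sketch compares paths as raw strings, which mirrors the limitation the commit message calls out.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical tracker for files that contained `#pragma once`.
class IncludeTracker {
public:
    // Called when `#pragma once` is seen while processing `path`.
    void markPragmaOnce(const std::string& path) { m_once.insert(path); }

    // Called for each `#include`; returns false if the file should be skipped.
    bool shouldInclude(const std::string& path) const {
        return m_once.count(path) == 0;
    }

private:
    // Paths are stored exactly as reported, with no normalization.
    std::unordered_set<std::string> m_once;
};
```

Since no canonicalization is done, `common.h` and `./common.h` count as different files in this sketch, so a second include through a differently spelled path would not be suppressed.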
6a8ad6e Support for [[vk::push_constant]] (#629) * Support for attributed [[vk::push_constant]] and [[push_constant]]. Can also use layout(push_constant). * Fix test so matches the expected output. * Add expected output to binding-push-constant-gl.hlsl * Trivial change to force travis rebuild to test the gcc linux build really has a problem. 22 August 2018, 15:36:02 UTC
0ce1131 Add support for more RasterizerOrdered types (#628) Fixes #627 The front-end has support for `RasterizerOrderedBuffer` and `RasterizerOrderedTexture*`, but left out support for: * `RasterizerOrderedByteAddressBuffer` * `RasterizerOrderedStructuredBuffer` [Nitpick: these types are all amazingly annoying to type. It is easy to want to write `RasterOrdered` instead of the bulkier `RasterizerOrdered`, and almost everybody does in casual speech. There's already the issue of wanting to type `StructureBuffer` (a buffer of structures) instead of `StructuredBuffer` (a buffer that is... structured?). Then you have `ByteAddressBuffer` which is just adding to the confusion because it is nominally a "byte addressable" buffer (so that `ByteAddressedBuffer` would actually make sense), but then actually *isn't* byte addressable in practice.] There were a few `TODO` comments related to this already, and this change was mostly a matter of doing a find-in-files for `RWByteAddressBuffer` and `RWStructuredBuffer` and adding matching `RasterizerOrdered` cases. The test I added just checks that these types make it through the front-end, and doesn't do any actual confirmation that they work as intended. It is worth noting that the handling of ordering in GLSL/VK is different from in HLSL ("pixel shader interlock" instead of "rasterizer ordered views"), so coming up with a cross-compilation story would need to be a later step. 21 August 2018, 15:40:25 UTC
56d8a75 Improve model-viewer support for lights (#626) * Improve model-viewer support for lights The main visible change here is that the model-viewer example supports multiple light sources, with a basic UI for adding new light sources to the scene, and for manipulating the ones that are there. Along the way I also refactored the `IMaterial` decomposition to be a bit less naive, while still only including a completely naive Blinn-Phong implementation. I also went ahead and spruced up the `cube.obj` file so that it has multiple materials, although it is still a completely uninteresting asset. * Fixup: Windows SDK version 11 August 2018, 05:21:44 UTC
73ff690 Add basic support for "Dear IMGUI" (#625) This isn't being made visible just yet, but it will allow us to have a simple UI for loading models into the model-viewer example. In order to support rendering with IMGUI I had to add the following to the `Renderer` layer: * viewports * scissor rects * blend support These are really only fully implemented for D3D11, but adding them to the other back-ends should be a reasonably small task. 06 August 2018, 22:52:38 UTC
68d705f Major overhaul of Renderer abstraction, to support a new example (#624) The original goal here was to bring up a second example program: `model-viewer`. While the existing `hello-world` example is enough to get somebody up to speed with the basics of the Slang API (as a drop-in replacement for `D3DCompile` or similar), it doesn't really show any of the big-picture stuff that Slang is meant to enable. There wasn't any use of D3D12/Vulkan descriptor tables/sets, and there wasn't any use of interfaces, generics, or `ParameterBlock`s in the shader code. The `model-viewer` example addresses these issues. Its shader code involves generics, interfaces, and multiple `ParameterBlock`s, and the host-side code demonstrates a few key things for working with Slang: * There is an application-level abstraction for parameter blocks, that combines the graphics-API descriptor set object with Slang type information * There is a shader cache layer used to look up an appropriate variant of a rendering effect by using parameter block types to "plug in" global type variables * There is a clear separation between the phases of compilation: a first phase that does semantic checking and enables reflection-based allocation of graphics API objects, followed by one or more code generation passes for specialized kernels. This example is certainly not perfect, and it will need to be revamped more going forward. In particular: * The output picture is ugly as sin. We need a plan for how to get this to load better content, perhaps even popping up an error message to note that the required input data isn't present in the basic repository. * The shader code is too simplistic. There isn't any real material variety, and the `IMaterial` abstraction is completely wrong. * The use of parameter blocks is facile because there are no resource parameters right now. Fixing that will likely expose issues around interfacing with Slang's reflection API. 
* The whole example exposes the issue that Slang's current APIs aren't really designed for the benefit of two-phase compilation (since our main client application has been stuck on one-phase compilation). * Global type parameters are actually a Bad Idea that we only did for compatibility with existing codebases. We should not be showing them off in an example of the Right Way to use Slang, but the language support for type parameters on entry points is still not complete. Of course, the majority of the changes here are *not* inside the example applications, and instead involve a major overhaul of the `Renderer` abstraction that is used for both tests and examples. The main thrust of the change is to make the abstraction layer be closer to the D3D12/Vulkan model than to a D3D11-style model. This is important for the `model-viewer` example, since it aspires to show how Slang can be incorporated into a renderer that targets a modern API. The most important bit is actually the use of descriptor sets and "pipeline layouts" a la Vulkan, since without these Slang's `ParameterBlock` abstraction won't make a lot of sense. Implementation of the abstraction for the various APIs has very much been on an as-needed basis. The current implementation is just enough for the two examples to work, plus enough to get all the tests to pass in both debug and release builds on Windows. A big missing feature in the API abstraction right now is memory lifetime management. The code had been trending toward something D3D11-like where a constant buffer could be mapped per-frame with the implementation doing behind-the-scenes allocation for targets like D3D12/Vulkan. I'd like to shift more toward a model of just exposing "transient" allocations that are only valid for one frame, because these are more representative of how an efficient renderer for next-generation APIs will work. 
That transition isn't actually complete, though, so there are problems with the existing examples where `hello-world` is actually scribbling into memory that the GPU might still be using, while `model-viewer` is doing full-on heavy-weight allocations on a per-frame basis with no real concern for the performance implications. All together, there are a lot of things here that need more work, but this branch has been way too long-lived already, and so I'd like to get this checked in as long as all the tests pass. 03 August 2018, 15:39:28 UTC
5ea746a Fix imageStore output for types other than 4-vectors (#622) Fixes issue #620 Given a `RWTexture*` store operation like:
```hlsl
RWTexture3D<float> a;
...
float x = 1.0f;
a[crd] = x;
```
We were generating output GLSL like:
```glsl
layout(rgba32f) image3D a;
...
float x = 1.0f;
imageStore(a, crd, x);
```
but in that case, the `imageStore` operation expected a `vec4` and not a `float` for the last argument, and we failed GLSL compilation. This change extends our handling of the `imageStore` operation in the stdlib so that we pad out the last argument if it is not a 4-vector. We also flesh out the code that was picking a `layout(...)` modifier for image formats so that it doesn't just blindly use `layout(rgba32f)` and instead takes the element type fed to `RWTexture3D<...>` into account. With these two changes, the above HLSL/Slang code now translates to:
```glsl
layout(r32f) image3D a;
...
float x = 1.0f;
imageStore(a, crd, vec4(x, float(0), float(0), float(0)));
```
Note that we are padding out the `x` argument to a full vector, and also that we declare the image with `layout(r32f)` to reflect the fact that it has only a single channel. 31 July 2018, 16:02:22 UTC
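The padding step described in 5ea746a can be sketched as a small text-emitting helper. `padToVec4` is a hypothetical function written for this illustration (the actual change lives in the stdlib's handling of `imageStore`); it assumes a floating-point element type and a component count between 1 and 4.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: given GLSL source text for a float value with
// `componentCount` components, returns text for a vec4 with the missing
// components filled with zeros. A 4-component value is returned unchanged.
std::string padToVec4(const std::string& expr, int componentCount) {
    if (componentCount >= 4) {
        return expr;
    }
    std::string out = "vec4(" + expr;
    for (int i = componentCount; i < 4; ++i) {
        out += ", float(0)";
    }
    out += ")";
    return out;
}
```

For a scalar `x` this yields the same `vec4(x, float(0), float(0), float(0))` text quoted in the commit message.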
9914ca8 Feature/attributed binding (#621) * Typo fix, and added dxc to command line documentation. * Fix small typos. Added support for Scope to lexer. Fix bug in Token ctor. * Add support for attribute names that are scoped. * Added GLSLBindingAttribute. Make binding work through core.meta.slang. * Allow [[gl::binding(binding, set)]] [[vk::binding(binding,set)]] 31 July 2018, 15:03:53 UTC
171f524 Fix translation of RWTexture subscript operations for Vulkan (#618) Partially fixes #615 There's kind of a mess going on here, and it is difficult to be sure which of the changes here are strictly necessary. Also, our testing isn't set up to run tests that use `RWTexture2D`, so the only testing I can really run is manual tests using Falcor. The most basic issue here is that in an earlier change I added `ref` accessors for the subscript operation on various `RW*` types in the standard library, and that included `RWTexture2D` (and the other `RWTexture*` types). The compiler ended up favoring a `ref` accessor over a `set` accessor even when the `set` would suffice, but only the `set` accessor could be lowered to GLSL/SPIR-V. This change ends up implementing two different fixes for the same problem: * Logic has been added to try and favor a `set` accessor over a `ref` accessor in the cases where either could be used (but still require a `ref` accessor to be used when it is really needed) * The `ref` accessor for `RWTexture*` has been removed, since it turns out that the operations that might have benefited from it (atomics, and component-granularity stores) aren't actually allowed on typed UAVs anyway. There is a deeper issue here that somebody needs to go through and rationalize our representation and handling of accessors like this, but I'm not going to be able to do that in the time I can put into this PR. 26 July 2018, 21:48:48 UTC
66f5f18 Fix implicit flat interpolation for GLSL output (#619) There was some logic called `maybeEmitGLSLFlatModifier()` that was supposed to emit an implicit `flat` modifier for any varying shader parameter with an integer type that wasn't qualified as `nointerpolation` in the input HLSL/Slang (where `nointerpolation` is the equivalent of `flat`). This wasn't being triggered because I apparently added code to only emit the implicit modifier if there was no explicit one, but then I had this code: ```c++ bool anyModifiers = false; anyModifiers = true; ... if(!anyModifiers && ...) { maybeEmitGLSLFlatModifier(); } ``` Unsurprisingly, the `anyModifiers = true` line meant that things never actually triggered. Once I fixed that issue the next problem that arose was that the `maybeEmitGLSLFlatModifier()` logic was being applied to *any* varying integer parameter, which includes fragment outputs, but GLSL forbids the `flat` modifier on fragment outputs and so gave an error on a shader that wrote to an integer target. I fixed up the logic to take computed layout for a shader parameter into account, and only emit the `flat` modifier for fragment *inputs*. As the `TODO` comment at that location notes, there may be some arcane rules about how a vertex shader also needs to use `flat` when declaring the matching output, so we may need to make that test more careful down the road. For now the shader that originally surfaced this problem now works under Vulkan. 26 July 2018, 20:36:28 UTC
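The decision that 66f5f18 ends up with can be sketched as a single predicate. This is a hedged illustration only: `VaryingParam`, `Stage`, and `needsImplicitFlat` are hypothetical names, and the real code consults the computed layout for the parameter rather than plain boolean fields.

```cpp
#include <cassert>

// Hypothetical stand-ins for information the compiler derives from the
// parameter's type, modifiers, and computed layout.
enum class Stage { Vertex, Fragment };

struct VaryingParam {
    bool  isIntegerType;                    // e.g. int, uint, or a vector of them
    bool  hasExplicitInterpolationModifier; // e.g. `nointerpolation` in the source
    Stage stage;                            // stage of the entry point
    bool  isInput;                          // true for inputs, false for outputs
};

// Returns true when an implicit `flat` should be emitted in GLSL.
bool needsImplicitFlat(const VaryingParam& param)
{
    if (!param.isIntegerType)
        return false; // only integer varyings need `flat`
    if (param.hasExplicitInterpolationModifier)
        return false; // the explicit modifier is emitted instead
    // GLSL forbids `flat` on fragment outputs, so only fragment inputs qualify.
    return param.stage == Stage::Fragment && param.isInput;
}
```

As the commit notes, a vertex shader may also need `flat` on the matching output; this sketch deliberately leaves that case out, as the commit does.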
dff216c Improved command line control for apis and synthesized tests (#616) * Parsing of control of api parameters no longer needs comma separator. Parsing of API list now can take an initial state. Document the command line option. * * Proper handling of 'default' (or initialFlags) - by using if the first token is an operator. * Clarified parsing of api flags. * Now 'vk' will mean just use vk. +vk will mean the defaults plus vk, and -vk is the defaults -vk. * Improve README.md on api expressions. Improve error text for failure to parse api expressions. 25 July 2018, 15:57:44 UTC
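The API expression rules in dff216c (`vk` means only vk, `+vk` means the defaults plus vk, `-vk` means the defaults minus vk) can be sketched as below. This is an assumption-laden illustration, not the test runner's parser: the flag constants, the accepted API names, and `parseApiExpression` are all hypothetical, and treating an empty expression as "use the defaults" is a choice made for this sketch.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical flag values and names, for illustration only.
const unsigned kApiVulkan = 1u << 0;
const unsigned kApiD3D11  = 1u << 1;
const unsigned kApiD3D12  = 1u << 2;
const unsigned kApiOpenGL = 1u << 3;

unsigned lookupApi(const std::string& name)
{
    if (name == "vk")   return kApiVulkan;
    if (name == "dx11") return kApiD3D11;
    if (name == "dx12") return kApiD3D12;
    if (name == "gl")   return kApiOpenGL;
    return 0; // unknown name
}

// Parses expressions such as "vk", "+vk", "-dx12" or "vk+gl".
// The defaults (initialFlags) are used as the starting state only when the
// first token is an operator. Returns false on a parse error.
bool parseApiExpression(const std::string& text, unsigned initialFlags, unsigned& outFlags)
{
    if (text.empty()) { outFlags = initialFlags; return true; } // assumption

    unsigned flags = (text[0] == '+' || text[0] == '-') ? initialFlags : 0;
    char op = '+';
    size_t pos = 0;
    while (pos < text.size())
    {
        if (text[pos] == '+' || text[pos] == '-')
            op = text[pos++];
        size_t start = pos;
        while (pos < text.size() && std::isalnum(static_cast<unsigned char>(text[pos])))
            ++pos;
        unsigned bit = lookupApi(text.substr(start, pos - start));
        if (bit == 0)
            return false; // empty or unknown API name
        if (op == '+') flags |= bit; else flags &= ~bit;
        op = '+';
    }
    outFlags = flags;
    return true;
}
```

Because the operators themselves delimit names, no comma separator is needed between terms.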
990ed73 Fix problem when doing options parse, that failure doesn't leave appropriate message in diagnostic string. (#612) 17 July 2018, 17:36:19 UTC
7b2a549 spCompile/spProcessCommandLineArguments return SlangResult (#610) * * Make spCompile return SlangResult * Make spProcessCommandLineArguments return SlangResult (and not internally exit) * Remove calls to exit() * Fix typos * Make all output from spProcessCommandLineArguments get sent to diagnostic sink. 06 July 2018, 15:51:19 UTC
338a770 Fix up definitions of half and double in stdlib (#608) An earlier change made sure that the `half` and `double` types properly conform to the `__BuiltinFloatingPointType` and `__BuiltinRealType` interfaces, but somehow that change modified only the *generated* source file (`core.meta.slang.h`) and not the source that fed into the generator (`core.meta.slang`). This meant that when building the compiler, we'd end up with spurious diffs because we'd run the generation logic and clobber the (correct) output file with freshly generated (wrong) code. This change adds the missing lines to the source file to fix up the issue. 28 June 2018, 19:33:03 UTC
dfe13b5 Share graphics API layer between tests/examples (#603) The `render-test` project has an in-progress graphics API abstraction layer, and it makes sense to share this code with our examples rather than write a bunch of redundant code between examples and tests. Most of this change is just moving files from `tools/render-test/*` to a new library project at `tools/slang-graphics/`. The most complicated code change there is renaming from `render_test` to `slang_graphics`. The existing `hello` example was ported to use the graphics API layer instead of raw D3D11 API calls. It is still hard-coded to use the D3D11 back-end and the `SLANG_DXBC` target, so more work is needed if we want to actually support multiple APIs in the examples. I also went ahead and implemented an extremely rudimentary set of APIs to abstract over the Windows platform calls that were being made in the example, so that we could potentially run that same example on other platforms. I did *not* port `render-test` to use those APIs, and I also did not implement them for anything but Windows (my assumption is that for most other platforms we would just use SDL2, and require people to ensure it is installed to their machine before building Slang examples). 28 June 2018, 18:14:48 UTC
22033f0 Support for Tessellation (#607) * Fix typo OuptutTopologyAttribute -> OutputTopologyAttribute First pass support for handing tesselation shaders - domain and hull. * Added attribute PatchConstantFuncAttribute * Added visitHLSLPatchType(HLSLPatchType* type) such that the patch type template parameters are handled * Added IRNotePatchConstantFunc - such that the patch constant function is referenced within IR * Added support for outputing typical tesselation attributes (although minimal validation is performed) * Added findFunctionDeclByName * Small improvements to diagnostic. * Improved diagnostics and checking for geometry shader attributes. * Added diagnostic if patchconstantfunc is not found Handle assert failure when outputing a domain shader alone and therefore attr->patchConstantFuncDecl is not set. * Simple script tess.hlsl to test out domain/hull shaders. * Added url for where hull shader attributes are defined. * Fix unsigned/signed comparison warning. * Restore removal of fix in "Improve generic argument inference for builtins (#598)" * Update tessellation test case to compare against fxc The test was previously comparing against fixed expected DXBC output, but this caused problems when the test runner tried to execute the test on Linux (where there is no fxc to invoke...), and would also be a potential source of problems down the road if different users run using different builds of fxc. The simple solution here is to convert the test to compare against fxc output generated on the fly. That test type is already filtered out on non-Windows builds, so it eliminates the portability issue (in a crude way). I also changed the test to compile both entry points in one compiler invocation, just to streamline things into fewer distinct tests. * Eliminate unnecessary call to `lowerFuncDecl` In a very obscure case this could cause a bug, if the patch-constant function had somehow already been lowered (because it was called somewhere else in the code). 
The call should not be needed because `ensureDecl` will lower a declaration on-demand if required, so eliminating it causes no problems for code that wouldn't be in that extreme corner case. 27 June 2018, 20:53:47 UTC
4bbd0e7 Feature/com helper (#606) * Added Result definitions to the slang.h * Removed slang-result.h and added slang-com-helper.h * Move slang-com-ptr.h to be publicly available. * Add SLANG_IUNKNOWN macros to simplify implementing interfaces. Use the SLANG_IUNKNOWN macros in slang.c * Removed slang-defines.h added outstanding defines to slang.h * Include slang-com-ptr.h and slang-com-helper.h in archives built with CI. * Use spaces instead of tabs on appveyor.yml * Put operator== and != for Guid in global namespace. * Fix binary windows archive to have all the slang headers. 22 June 2018, 21:36:59 UTC
4fa0111 Feature/com helper (#605) * Added Result definitions to the slang.h * Removed slang-result.h and added slang-com-helper.h * Move slang-com-ptr.h to be publicly available. * Add SLANG_IUNKNOWN macros to simplify implementing interfaces. Use the SLANG_IUNKNOWN macros in slang.c * Removed slang-defines.h added outstanding defines to slang.h * Include slang-com-ptr.h and slang-com-helper.h in archives built with CI. * Use spaces instead of tabs on appveyor.yml 22 June 2018, 19:06:41 UTC
d0c9571 Expose macros/functionality for defining interfaces (#604) * Added Result definitions to the slang.h * Removed slang-result.h and added slang-com-helper.h * Move slang-com-ptr.h to be publicly available. * Add SLANG_IUNKNOWN macros to simplify implementing interfaces. Use the SLANG_IUNKNOWN macros in slang.c * Removed slang-defines.h added outstanding defines to slang.h 22 June 2018, 17:09:01 UTC
e66d66b Add support for "blobs" and a file-system callback (#596) * Add support for "blobs" and a file-system callback The most obvious change here is that the Slang header now includes a few COM-style interfaces that can be used for communication between the application and compiler. In order to support the declaration of COM-like interfaces, several platform-detection macros were lifted out of `slang-defines.h` and into the public `slang.h` header. As it exists right now, this change makes the Slang API C++-only, but a C-compatible version can be defined later with the help of lots of macros (and/or something like an IDL compiler). The two big interfaces introduced are: * The `ISlangBlob` interface, which is compatible with `ID3DBlob`, `IDxcBlob`, etc. This is used to pass ownership of source/compiled code across the API boundary without copies. New versions of various entry points have been added to allow passing blobs: e.g., `spAddTranslationUnitSourceBlob` and `spGetEntryPointCodeBlob`. * The `ISlangFileSystem` interface, which is used to allow applications to intercept any attempt by the Slang compiler to load a file (input source files, include files, etc.). This is *not* the same as the `IDxcIncludeHandler` interface, because it assumes UTF-8 encoded path names, instead of the 16-bit encoding that dxc/Windows prefer. It is also not very similar to `ID3DInclude` as used by fxc, because this callback interface is *not* responsible for handling the search through include paths, etc. - it is just a file-system abstraction layer. Internally, a few different parts of the compiler were changed to either store data in blob form all the time, or to be able to synthesize a blob on-demand. Because our internal `String` type is a reference-counted copy-on-write type, using a `SlangStringBlob` to hold string data should achieve transfer of ownership back to the application without extraneous copies. 
There is plenty of room to clean up the architecture of some of these internal pieces if they *know* that their data will end up in a blob. The existing Slang testing doesn't touch any of the APIs introduced here, so they can only confirm that existing functionality hasn't been broken. The new ability to return code blobs has been tested by integration of that feature into Falcor, but there has been zero testing of the ability to pass *in* source code as blobs, and the ability to hook file loading. Future changes will need to add test coverage for the new features. * fixup: define SLANG_NO_THROW for non-Windows builds * fixup: header copy-paste error caught by clang/gcc * Cleanup: return reference-counted objects via output parameters Returning a reference-counted object through the API as a raw pointer creates challenges. The "obvious" answer is that the returned pointer should have an added reference (it is returned at "+1"), and the caller is responsible for releasing that reference. This makes sense when using raw pointers on the calling side: ```c++ IFoo* foo = spGetFoo(...); ... foo->Release(); ``` However, as soon as smart pointers start getting involved (to handle releasing reference counts when we are done with things), the picture gets more complicated: ```c++ MySmartPtr<IFoo> foo = spGetFoo(...); ... ``` The intention of code like that is that `foo` gets released when the smart pointer goes out of scope, but this probably doesn't happen with most smart pointer implementations. If the `MySmartPtr` constructor that takes a raw pointer retains it, then the destructor will only release *that* reference, and so the object will leak. It is possible that the user will have a smart pointer type where the constructor that takes a raw pointer doesn't retain it, but in general such types introduce the potential for errors of their own, and no matter what the Slang API shouldn't go in assuming any particular policy. 
This change makes it so that any reference-counted objects that are logically returned from a call are returned through output pointers. This design makes the leak-free cases easy (enough) to implement with raw pointers or smart pointers: ```c++ // raw pointer IFoo* foo = nullptr; spGetFoo(..., &foo); ... foo->Release(); // smart pointer MySmartPtr<IFoo> foo; spGetFoo(..., foo.writeableRef()); ... ``` The only assumption here is that any COM smart-pointer type needs to provide an operation like `writableRef` that is suitable for using that pointer as an output parameter. Given that COM *loves* output parameters, this seems like a safe assumption (at the very least, anybody who interacts with COM would be used to this convention). Future changes might introduce inline convenience methods for various operations that return results more directly, possibly by introducing a minimal smart-pointer type in the `slang.h` header (without prescribing that clients must use it...). * fixup: another error caught by gcc/clang 14 June 2018, 18:56:31 UTC
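The output-parameter convention from e66d66b can be sketched with a toy reference-counted type. Everything here is hypothetical and only mirrors the shape of the argument: `Foo`, `getFoo`, and `ComPtr` are not Slang API, and the smart pointer is a minimal stand-in whose raw-pointer constructor retains (the policy that would leak with a "+1" return value).

```cpp
#include <cassert>

// Toy reference-counted object; liveCount lets us observe leaks.
struct Foo {
    static int liveCount;
    int refCount = 0;
    Foo()  { ++liveCount; }
    ~Foo() { --liveCount; }
    void addRef()  { ++refCount; }
    void release() { if (--refCount == 0) delete this; }
};
int Foo::liveCount = 0;

// Hypothetical API entry point: the object is handed back through an
// output parameter, already holding one reference owned by the caller.
void getFoo(Foo** outFoo)
{
    Foo* foo = new Foo();
    foo->addRef();
    *outFoo = foo;
}

// Minimal smart pointer sketch with a `writableRef` style operation.
template <typename T>
class ComPtr {
public:
    ComPtr() = default;
    explicit ComPtr(T* ptr) : m_ptr(ptr) { if (m_ptr) m_ptr->addRef(); } // retains
    ComPtr(const ComPtr&) = delete;
    ComPtr& operator=(const ComPtr&) = delete;
    ~ComPtr() { if (m_ptr) m_ptr->release(); }

    // Releases any held object and exposes the slot for use as an output
    // parameter; the callee's "+1" reference is adopted without a retain.
    T** writableRef()
    {
        if (m_ptr) { m_ptr->release(); m_ptr = nullptr; }
        return &m_ptr;
    }
    T* get() const { return m_ptr; }
private:
    T* m_ptr = nullptr;
};
```

With this shape, `ComPtr<Foo> foo; getFoo(foo.writableRef());` ends with exactly one reference that the destructor releases, and the raw-pointer path works the same way with an explicit `release()`.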
126e75d Improve generic argument inference for builtins (#598) Fixes #487 The basic problem here is that the user writes something like: ```hlsl float invSqrt2 = 1 / sqrt(2); ``` In this case the user knows that `sqrt()` is only defined for floating-point types, so they expect this to compile something like: ```hlsl float invSqrt2 = float(1) / sqrt(float(2)); ``` The challenge this creates for the Slang compiler is that we use generics to streamline our declarations of all the builtins, so that the scalar `sqrt()` function is actually declared as: ```hlsl T sqrt<T:__BuiltinFloatingPointType>(T value); ``` The `__BuiltinFloatingPointType` is an `interface` defined as part of the standard library, such that only built-in floating-point types conform to it (that is, `half`, `float`, and `double`). When generic argument inference applies to a call like `sqrt(2)`, we see an argument of type `int`, and try to infer `T=int`, which leads to a failure because `int` does not conform to `__BuiltinFloatingPointType`. The point where this currently fails is in the logic to "join" two types for inference, which is supposed to pick the best type that can represent both of two input types. E.g., a join between `float` and `int3` would be `float3`, since both of those types can convert to it, and it is the "minimal" type with that property. So, the goal here is simple: we want a "join" between `int` and `__BuiltinFloatingPointType` to yield the `float` type. The way we handle that in this change is to special case the join of a basic scalar type and an interface, by enumerating all the basic scalar types, filtering them for ones that support the chosen interface and can be implicitly converted from the argument type, and then picking the "best" of them (the comments in the code explain what "best" means in this context). 
The technique used here could be generalized in the future to deal with user-defined types or more cases, but that would risk slowing down overload resolution even more, which is already the most expensive part of our semantic checking pass. A test case has been added for the specific case of `sqrt()` applied to an `int` argument. 14 June 2018, 14:29:45 UTC
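The enumerate, filter, and pick-best steps described in 126e75d can be sketched as follows. This is a sketch under stated assumptions, not the compiler's implementation: the `Scalar` enum, the function names, and especially the conversion-cost table are hypothetical, and "best" is assumed here to mean lowest conversion cost (the commit defers the real definition to comments in the code).

```cpp
#include <cassert>

enum class Scalar { Bool, Int, UInt, Half, Float, Double };

const Scalar kAllScalars[] = {
    Scalar::Bool, Scalar::Int, Scalar::UInt,
    Scalar::Half, Scalar::Float, Scalar::Double,
};

// Stand-in for "conforms to __BuiltinFloatingPointType".
bool conformsToFloatingPoint(Scalar type)
{
    return type == Scalar::Half || type == Scalar::Float || type == Scalar::Double;
}

// ASSUMPTION: illustrative costs only. Lower is better; -1 means there is
// no implicit conversion in this toy table.
int implicitConversionCost(Scalar from, Scalar to)
{
    if (from == to) return 0;
    if (from == Scalar::Int || from == Scalar::UInt)
    {
        if (to == Scalar::Float)  return 1;
        if (to == Scalar::Double) return 2;
        if (to == Scalar::Half)   return 3;
    }
    if (from == Scalar::Half && to == Scalar::Float)   return 1;
    if (from == Scalar::Half && to == Scalar::Double)  return 2;
    if (from == Scalar::Float && to == Scalar::Double) return 1;
    return -1;
}

// Joins an argument type with the floating-point interface: enumerate the
// scalar types, keep those that conform and are reachable by implicit
// conversion, and pick the cheapest. Returns false if none qualifies.
bool joinWithFloatingPointInterface(Scalar argType, Scalar& outType)
{
    int bestCost = -1;
    for (Scalar candidate : kAllScalars)
    {
        if (!conformsToFloatingPoint(candidate)) continue;
        int cost = implicitConversionCost(argType, candidate);
        if (cost < 0) continue;
        if (bestCost < 0 || cost < bestCost)
        {
            bestCost = cost;
            outType = candidate;
        }
    }
    return bestCost >= 0;
}
```

Under this toy cost table, joining `Scalar::Int` with the interface selects `Scalar::Float`, which corresponds to the `sqrt(2)` case in the commit message.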
77562ef Make render-test use Slang for all shader compilation (#597) * Make render-test use Slang for all shader compilation This streamlines the code for render-test by having all its shader compilation go through the Slang API, so that it doesn't have to deal with custom logic to compile HLSL->DXBC and HLSL->DXIL. We were already leaning on Slang to generate SPIR-V for Vulkan, so this makes all the paths more consistent. My original plan with this change was to make the D3D12 render path start using DXIL at this point, since the change would make that easy, but it turns out that some aspects of how we handle parameter binding are not compatible with that right now, so it would need to come as a later change. There's a lot of details here, so I will try to walk through the changes, including the incidental ones: * Add logic to `premake5.lua` so that we copy the necessary libraries for HLSL shader compilation to our target directory from the Windows SDK. This is necessary so that our tests can actually invoke `dxcompiler.dll` * Re-run Premake to generate new project files. This moves around a few files that I manually added in previous changes without re-running Premake. * When invoking `fxc` as a pass-through compiler, be sure to pass along any macros defines via API or command-line. This isn't a strictly required change with how things worked out, but it is a positive one anyway, because it makes `slangc -pass-through fxc` more useful. * Don't print output from a downstream `fxc` invocation if it produces warnings but no errors. The main reason for this is so that our tests don't fail because of `fxc` warnings on Slang's output (which then don't match the baselines), but it can also be rationalized as not wanting to confuse users with warnings that don't come from the "real" compiler they are using. This probably needs fine-tuning as a policy. * Add the HLSL `NonUniformResourceIndex` function. 
This was an oversight because it isn't documented as a builtin on MSDN, and only gets mentioned obliquely when they talk about resource indexing. * Add `glsl_<version>` profiles to match our `sm_<version>` profiles, so that it is easy for a user to use the profile mechanism to request a specific GLSL version without also specifying a stage name. * Update the render-test logic so that there is a single `ShaderCompiler` implementation that *always* uses Slang, and get rid of all of the renderer-specific `ShaderCompiler` implementations. * Update logic in render-test `main.cpp` to select the options to use for the eventual Slang compile based on the choice of renderer and input language. I didn't change the options that render-test exposes, even though they are getting increasingly silly (e.g., `-glsl-rewrite` doesn't use GLSL as its input...). * Note: the D3D12 renderer will still use fxc, DXBC, and SM 5.0 for now, since trying to update it to switch to dxc, DXIL, and SM 6.0 didn't work well at the time. * Add a bit of supporting D3D12 code to make sure that we don't allocate a structured buffer when a buffer has a format. * Make sure to *also* define the `__HLSL__` macro when compiling Slang code, because otherwise a bunch of tests don't work (I'm not clear on how it worked before...). * fixup: missing file 13 June 2018, 22:39:04 UTC
a4dd936 Fix some issues around codegen for l-values and assignment (#601) The problem here arose when a complicated l-value was formed like: ```hlsl struct Foo { float4 a; } RWStructuredBuffer<Foo> gBuffer; gBuffer[index].a.xz += whatever; ``` In this case the `gBuffer[index].a.xz` expression is a complex l-value in multiple ways: * The `gBuffer[index]` subscript could be routed to either a `get` accessor or a `ref` accessor (and maybe also a `set` accessor if we add one to the stdlib definition), and we defer the choice of which to call until as late as possible in codegen today. * The `_.a` part then becomes a "bound member access" because we can't actually produce a direct pointer until we've resolved how to implement the subscript operation. * The `_.xz` part becomes a "swizzled l-value" because there is *no* way to materialize it as a pointer to contiguous storage in the original object (the `x` and `z` components of a vector aren't contiguous). Recent changes to support atomic operations on buffer elements introduced the `ref` accessor on `RWStructuredBuffer`, which made it possible to form a pointer to a buffer element in the IR. This interacted with some code for the "bound member" case that was trying to only introduce a temporary when absolutely necessary, and was doing so by assuming anything with an address didn't need to be moved into a temporary. The first fix is to clean up that logic in the bound-member case for assignment: always create a temporary, rather than do it conditionally. The second fix here is more systemic: we add logic to try to coerce the representation of an l-value during codegen into being a simple address, and employ that in cases where we know an address is desired. In a case like the above this helps to get things into the form that is required, so that a swizzled store can be issued. There is still some potential for cleanup in this logic, but I don't want to introduce more changes than seem necessary to fix the original problem. 
13 June 2018, 20:56:30 UTC
860b0d6 Fixes related to handling of empty types (#600) PR #577 tries to eliminate empty `struct` types by replacing them with a `LegalType::tuple` with zero elements, but this seems to run into problems in some cases, where we end up trying to match up `::none` values with empty `::tuple`s. An alternative way to handle this is to never create empty `LegalType::tuple`s (and the same for `LegalVal::tuple`), and instead create `LegalType::none` and `LegalVal::none`. PR #577 avoided this because there were various cases in the legalization logic that didn't robustly handle `LegalType::Flavor::none`. This PR thus includes two main changes: 1. Construct a `::none` type when we have an empty `struct` type. 2. Survey all places that handle the `::tuple` case and extend them to handle the `::none` case if it was missing. This fixes an issue filed in Falcor's internal GitLab as number 424. 13 June 2018, 18:18:44 UTC
167d857 Initial support for enum declarations (#599) Slang `enum` declarations will always be scoped, e.g.: ```hlsl enum Color { Red, Green = 2, Blue, } Color c = Color.Red; // Not just `Red` ``` A user can write `enum class` as a placebo for now (to ease sharing of headers with C++). Slang does not currently support the `::` operator for static member lookup, so it must be `Color.Green` and not `Color::Green`. Support for `::` as an alternate syntax could be added later if there is strong user demand. An `enum` type can have a declared "tag type" using syntax like C++ `enum class`: ```hlsl enum MyThings : uint { First = 0, // ... } ``` The `enum` cases will store their values using that type. An `enum` that doesn't declare a tag type will use the type `int` by default. Enum cases are assigned values just like in C/C++: cases can have explicit values, but otherwise default to one more than the previous case, or zero for the first case. All `enum` types will automatically conform to a standard-library `interface` called `__EnumType`, which is used so that basic operators like equality testing can be defined generically for all `enum` types. This change only adds one operator at first (the `==` comparison), but others should be added later. An `enum` case needs to be explicitly converted to an integer where needed (e.g., `int(Color.Red)`). This is implemented by having the main integer types (`int` and `uint`) support built-in initializers that can work for *any* `enum` type (or rather, anything conforming to `__EnumType`). Eventually these will be restricted so that an `enum` type can only be converted to its associated tag type. IR code generation completely eliminates `enum` types and their cases. The `enum` type will be replaced with its tag type, and the cases will be replaced with the tag values. Currently this could leave some mess in the IR where cast operations are applied between values that actually have the same type. 12 June 2018, 21:59:13 UTC
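The value-assignment rule stated in 167d857 (explicit values win, otherwise one more than the previous case, zero for the first) can be sketched in C++ as below. `EnumCase` and `assignCaseValues` are hypothetical names for illustration, and the sketch assumes the default `int` tag type.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one enum case as written in source.
struct EnumCase {
    const char* name;
    bool        hasExplicitValue; // true for cases like `Green = 2`
    int         explicitValue;    // only meaningful when hasExplicitValue is true
};

// Computes the tag value of each case, in declaration order.
std::vector<int> assignCaseValues(const std::vector<EnumCase>& cases)
{
    std::vector<int> values;
    int next = 0; // the first case defaults to zero
    for (const EnumCase& enumCase : cases)
    {
        int value = enumCase.hasExplicitValue ? enumCase.explicitValue : next;
        values.push_back(value);
        next = value + 1; // the following case defaults to one more
    }
    return values;
}
```

Applied to the `Color` example from the commit message (`Red`, `Green = 2`, `Blue`), this gives the values 0, 2, and 3.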
7852a2b Add basic support for Shader Model 6.3 profiles (#594) * Add basic support for Shader Model 6.3 profiles This adds `vs_6_3` and friends as available profiles, but doesn't add any new builtins specific to Shader Model 6.3. In order to better support the ray tracing shader stages, Slang will now automatically map any attempt to compile a DXR shader up to SM 6.3 (the shader model officially required for these stages) and to the `lib_*` profiles (because there are no stage-specific profiles for these cases). As an added detail, when invoking `dxcompiler.dll` to generate DXIL for DXR shaders, specify an empty entry-point name, since that is expected for `lib_*` profiles. * Fixup: don't drop [shader(...)] attributes The previous change makes the "effective profile" for DXR compiles no longer include a stage, but we had been using the stage stored on the effective profile in exactly one place: when determining what to output for a `[shader("...")]` attribute. This fixup makes it so that we use the stage from the profile on the entry-point layout instead, which seems like the right choice anyway, if we are ever going to emit multiple entry points at once. 06 June 2018, 18:59:32 UTC
1a69812 Fix atomic operations on RWBuffer (#593) * Fix atomic operations on RWBuffer An earlier change added support for passing true pointers to `__ref` parameters to fix the global `Interlocked*()` functions when applied to `groupshared` variables or `RWStructuredBuffer<T>` elements. That change didn't apply to `RWBuffer<T>` or `RWTexture2D<T>`, etc. because those types had so far only declared `get` and `set` accessors, but not any `ref` accessors (which return a pointer). The main fixes here are: * Add `ref` accessors to the subscript operations on the `RW*` resource types * Adjust the logic for emitting calls to subscript accessors so that we don't get quite as eager about invoking a `ref` accessor, and instead try to invoke just a `get` or `set` accessor when these will suffice. This is important for Vulkan cross-compilation, where we don't yet support the semantics of our `ref` accessors. * Add a test case for atomics on a `RWBuffer` * Fix up `render-test` so that we can specify a format for a buffer resource, which allows us to use things other than `*StructuredBuffer` and `*ByteAddressBuffer`. The work there is probably not complete; I just did what I could to get the test working. * A bunch of files got whitespace edits thanks to the fact that I'm using editorconfig and others on the project seemingly aren't... * fixup: remove ifdefed-out code 06 June 2018, 04:35:48 UTC
8b16bbf Emit directives to control matrix layout (#590) The HLSL/GLSL output by Slang should try to be robust against whatever flags somebody uses to compile it. Therefore, we will go ahead and output a target-language-specific directive to control the default matrix layout mode so that we can override whatever might be specified via flags. Also, as long as we are at it, this change goes ahead and makes Slang unconditionally emit row/column-major layout modifiers on all matrices (and arrays of matrices) whereas before these were only being output sometimes (the code to do it seemed buggy to me...). 04 June 2018, 22:12:56 UTC
698ba86 1st stage renderer binding refactor (#587) * First pass at support for textures in vulkan. * Binding state has first pass support for VkImageView VkSampler. * Split out _calcImageViewType * Fix bug in debug build around constant buffer being added but not part of the binding description for the test. * Offset recalculated for vk texture construction just store the texture size for each mip level. * When outputing a vector type with a size of 1 in GLSL, it needs to be output as the underlying type. For example vector<float,1> should be output as float in GLSL. * Vulkan render-test produces right output for the test tests/compute/textureSamplingTest.slang -slang -gcompute -o tests/compute/textureSamplingTest.slang.actual.txt -vk * Small improvement around xml encoding a string. * More generalized test synthesis. * Fix image usage flags for Vulkan. * Improvements to what gets synthesized vulkan tests. * Do transition on all mip levels. * Fixing problems appearing from vulkan debug layer. * Disable Vulkan synthesized tests for now. * Add Resource::Type member to Resource::DescBase. * Removed the CompactIndexSlice from binding. Just bind the indices needed. * BindingRegister -> RegisterSet * RegisterSet -> RegisterRange * Typo fix for debug build. * Remove comment that no longer applied. 01 June 2018, 14:41:13 UTC
8d77db3 Add options to control matrix layout rules (#583) * Add options to control matrix layout rules Up to this point, the Slang compiler has assumed that the default matrix layout conventions for the target API will be used. This means column-major layout for D3D, and *row major* layout for GL/Vulkan (note that while GL/Vulkan describe the default as "column major" there is an implicit swap of "row" and "column" when mapping HLSL conventions to GLSL). This commit introduces two main changes: 1. The default layout convention is switched to column-major on all targets, to ensure that D3D and GL/Vulkan can easily be driven by the same application logic. I would prefer to make the default be row-major (because this is the "obvious" convention for matrices), but I don't want to deviate from the defaults in existing HLSL compilers. 2. Command-line and API options are introduced for setting the matrix layout convention to use (by default) for each code generation target. It is still possible for explicit qualifiers like `row_major` to change the layout from within shader code. I also added an API to query the matrix layout convention that was used for a type layout (which should be of the `SLANG_TYPE_KIND_MATRIX` kind), but this isn't yet exercised. I added a reflection test case to make sure that the offsets/sizes we compute for matrix-type fields are appropriately modified by the flag that gets passed in. In a future change we could possibly switch the default convention to row-major, if we also changed our testing to match, since there are currently not many clients to be adversely impacted by the change. * Fixup: silence 64-bit build warning 31 May 2018, 17:50:28 UTC
8c593ae GroupMemoryBarrierWithGroupSync only works on groupshared memory - it doesn't block on global memory accesses. The fix is to copy the values to be processed by InterlockedAdd into shared array. The previous test ran successfully on Dx11, but broke on Dx12. (#586) 30 May 2018, 15:32:26 UTC
8b67c7b Feature/vulkan texture (#579) * First pass at support for textures in vulkan. * Binding state has first pass support for VkImageView VkSampler. * Split out _calcImageViewType * Fix bug in debug build around constant buffer being added but not part of the binding description for the test. * Offset recalculated for vk texture construction just store the texture size for each mip level. * When outputing a vector type with a size of 1 in GLSL, it needs to be output as the underlying type. For example vector<float,1> should be output as float in GLSL. * Vulkan render-test produces right output for the test tests/compute/textureSamplingTest.slang -slang -gcompute -o tests/compute/textureSamplingTest.slang.actual.txt -vk * Small improvement around xml encoding a string. * More generalized test synthesis. * Fix image usage flags for Vulkan. * Improvements to what gets synthesized vulkan tests. * Do transition on all mip levels. * Fixing problems appearing from vulkan debug layer. * Disable Vulkan synthesized tests for now. 29 May 2018, 20:48:04 UTC
e7a8332 Fix global atomic functions (#582) Fixes #581 This change adds a new parameter passing mode `__ref` to exist alongside `in`, `out`, and `inout`. The `__ref` modifier indicates true by-reference parameter passing (whereas `inout` is copy-in-copy-out). This is not intended to be something that users interact with directly, but rather a low-level feature that lets us provide a correct signature for the `Interlocked*()` operations in the standard library. Most of the support for passing what are logically addresses around already exists in the IR, so the majority of the work here is just in introducing the new type `Ref<T>` and then using it appropriately when lowering `__ref` parameters/arguments to the IR. 29 May 2018, 18:39:55 UTC
ace9a8d Fixes 574. Eliminate empty structs during type legalization (#577) 25 May 2018, 14:01:34 UTC
18709fb A bunch of work to resolve #569 (#576) * render-test should not fail on HLSL compiler *warnings* The logic in `render-test` that invokes `D3DCompile` was causing a test to fail if it produced any warnings (not just if compilation fails). Warning output can be dealt with by the test runner, since it will compare output between runs anyway, and it is useful to be able to run something through `render-test` that compiles with warnings. * Be more careful about deleting IR instructions There was an `IRInst::deallocate()` method that had a precondition that the instruction should already be removed from its parent and clear out all its operands before calling, but it wasn't checking this and the few call sites weren't doing things right either. I consolidated things on `IRInst::removeAndDeallocate()` which does all the things: removes from the parent, clear out operands, and then deallocates. I also made sure to clear out the type operand. This clears up some crashing issues where passes were removing instructions but those instructions would still show up as users of other instructions. * Don't emit bitwise not for non-Boolean types It seems like the logic in `emit.cpp` messed things up and decided that `Not` (the IR instruction that is equivalent to `!` in the AST) should emit as `!` for Boolean types and `~` for other types, but this makes no sense (e.g., `~(a & 1)` is very different from `!(a & 1)`, even when interpreted as a condition). It seems like this logic was intended for the `BitNot` case, where `~a` and `!a` are actually equivalent for Boolean values (but a target language might not like `~a` on `bool` values). Maybe the original plan was that the `Not` instruction should only apply to Boolean values in the first place, and that other values should be converted to `bool` (or a vector of `bool`) before applying `Not`, but even in that case the emit logic makes no sense. 
This caused an actual problem for one of my test cases, so it was important to fix it now. * Fix issue with cached resolution for overloaded operators The basic problem was that the lookup logic was forming a key based on the *first* definition it found for the overloaded operator, but that means that when processing a prefix `++a` call we might look up the *postfix* definition of `operator++` and decide to use its opcode as the key. This "fixes" the logic by looking for the first definition with a "compatible" definition (e.g., a `__prefix` function if we are checking a `PrefixExpr`), and then uses its opcode. A better fix in the long run would be to make the cache just be keyed on the operator name and the "fixity" of the expression (prefix, postfix, or infix). * Introduce an intermediate structured control-flow representation The code previously used a single function called `emitIRStmtsForBlocks` in `emit.cpp` that would take a logical sub-graph of the CFG and emit it as high-level statements. It would do this by recognizing operations like conditional branches that it could turn into high-level `if` statements, etc. The main problem with this function was that it mixed together the logic for how we restructure the program with the logic for how we emit high-level code from that structure. This change splits those two parts of the algorithm by introducing an intermediate data structure: a tree of `Region`s, which represent single-entry regions of the CFG. There are subclasses of `Region` corresponding to various structured control-flow constructs, and then a leaf case that wraps a single `IRBlock`. The new function `generateRegionsForIRBlocks()` (in `ir-restructure.cpp`) now handles the restructuring work, by building one or more `Region`s to represent a sub-graph, while `emitRegion()` handles emitting HLSL/GLSL source code from a region. 
Splitting things in this way opens up some opportunities for future changes: * We can expand the set of IR control-flow constructs allowed, so long as we can still generate structured `Region`s from them, without having to mess with the emit logic (e.g., we could start to support multi-level `break` by introducing temporaries as needed). In the limit we can generate our `Region`s using something like the "Relooper" algorithm. * We can emit to other representations while retaining the same control-flow restructuring support. E.g., if we drop the structured information from the IR, then emitting to SPIR-V for Vulkan would require us to use the structured control-flow information from these `Region`s. * We can do analysis that needs to understand `Region` structure. This is relevant to issue #569, which was what prompted me to start on this work. Now that we have a representation of the nesting of `Region`s, we can use it to reason about visibility of values between blocks. During development of this change I ran into a gotcha, in that I had been assuming each IR block would map to a single `Region`, forgetting that our current lowering of "continue clauses" in `for` loops leads to them being duplicated. The `Region` representation handles this by having a linked-list struct mapping IR blocks to the `SimpleRegion`s that represent them. I added a test case that includes a `for` loop with a continue clause that is reached along multiple paths just to make sure that we continue to support that case. The compiler output should not change as a result of this work; this is supposed to be a pure refactoring change. * Add a pass to resolve scoping issues in generated code Fixes #569 The basic problem arises because the structured control flow that we output in high-level HLSL/GLSL doesn't match the "scoping" rules of an SSA IR. 
In particular, SSA says that a value can be used in any block that is dominated by the definition, but in the presence of `break` and `continue` statements it is easy to construct cases where a block dominates something that is not in its scope for structured control flow. Consider: ```hlsl for(;;) { int a = xyz; if(a) { int b = a; break; } int c = a; } int d = b; ``` This program is invalid as HLSL, because the variable `b` is referenced outside of its scope, but if we look at the CFG for this function, it is clear that the block that computes `b` dominated the block that computes `d`. IR optimizations can easily create code like this, so we need to be ready for it. The previous change added an explicit `Region` structure to represent the structured control flow that we re-form out of the IR, and this change adds a pass that exploits the structuring information to detect cases like the above and introduce temporaries to fix the scoping issue. For example, the pass would change the earlier code block into something like: ```hlsl int tmp; for(;;) { int a = xyz; if(a) { int b = a; tmp = b; break; } int c = a; } int d = tmp; ``` That is, we introduce a new `tmp` variable at a scope "above" both the definition and use of `b`, and then we copy `b` into that temporary right where it is computed, and then use the temporary instead of the original `b` at the use site. A few details that came up during the implementation: * Downstream compilers may get confused by code like the above, and complain that `tmp` may be used before it is initialized, even though the very definition of dominators in a CFG means we don't have to worry about it. Still, I introduced some one-off code to initialize the temporaries just to silence spurious warnings coming from fxc. 
* We need to be careful not to apply this logic to "phi nodes" (the parameters of basic blocks) since they will already be turned into temporaries by the emit logic, and trying to introduce temporaries with this pass led to broken code (I still need to investigate why). It may be that a future version of this pass should also take the code out of SSA form, so that we can introduce both kinds of temporaries in a single pass (and maybe eliminate some unnecessary variables by doing basic register allocation). There is another transformation that could fix some issues of this kind, by moving code out of a structured control-flow construct and to the "join point" after it. For example, we could turn our loop from the start of this commit message into: ```hlsl for(;;) { int a = xyz; if(a) { break; } int c = a; } int b = a; int d = b; ``` Moving the definition of `b` to after the loop is possible because there is no way to get out of the loop without executing that code anyway. Now the scoping issue for `d`'s use of `b` has gone away, but of course we've introduced a *new* scoping issue for `a`, when it gets used by `b`. Adding a pass to re-arrange control flow like this could reduce the cases where we have to apply the current pass, but it wouldn't eliminate them entirely. That means such a pass can be deferred to future work. This change includes a test case that reproduces the original issue, so that we can confirm the fix works. 25 May 2018, 02:20:11 UTC
d7515c3 Fix Slang->GLSL translation for entry point with multiple `out` parameters (#573) Fixes #568 The problem occurs when an entry point declares multiple `out` parameters: ```hlsl void myVS( out float4 a : A, out float4 b : B ) { ... a = whatever; b = somethingElse; ... if(done) { return; // explicit return } ... // implicit return } ``` Slang translates code like this by introducing a GLSL global `out` parameter for each of `a` and `b`, rewriting the logic inside the entry point to use a local temporary instead of the real parameters, and then assigning from the locals to the globals at every `return` site: ```glsl out vec4 g_a; out vec4 g_b; void main() { // insertion location (1) vec4 t_a; vec4 t_b; ... t_a = whatever; t_b = somethingElse; ... if(done) { // insertion location(2) g_a = t_a; g_b = t_b; return; // explicit return } ... // insertion location (3) g_a = t_a; g_b = t_b; // implicit return } ``` Note that there are three different places (for this example) where code gets inserted to make the translation work. We insert declarations of local variables at the top of the function, and then insert the copy from the temporaries to the globals at each `return` site (implicit or explicit). The bug in this case was that the pass was setting the insertion location to (1) outside of the loop for parameters, so that when it was done with `a` and moved on to `b`, it would end up inserting the temporary `t_b` at the last location used (location (3) in this example), and this would result in invalid code, because `t_b` gets used before it is declared. This bug has been around for a while, but it has largely been masked by the fact that so few shaders use multiple `out` parameters, and also because Slang's SSA-ification pass would often be able to eliminate the local variable anyway, so that the bug never bites the user. 
The reason it surfaced now for a user shader was because we introduced `swizzledStore`, which currently inhibits SSA-ification, so that some temporaries that used to get eliminated are now retained so that they can break things. The fix in this case is small: we use the existing `IRBuilder` only for insertions at location (1) and construct a new builder on the fly for all the insertions at `return` sites. I have not included a test case yet, because our end-to-end Vulkan testing is not yet ready, so this may regress again in the future. 23 May 2018, 18:49:18 UTC
76652fa When outputting a vector type with a size of 1 in GLSL, it needs to be output as the underlying type. For example vector<float,1> should be output as float in GLSL. (#572) 23 May 2018, 16:35:43 UTC
10190da Handle structure initializers in IR type legalization (#567) Fixes #566 The basic problem here is that the front-end translates a structure initializer-list expression into a `makeStruct` instruction (with one argument per field), but the IR type legalization logic wasn't handling the case where a `makeStruct` is used to construct a struct value that needs to get split by legalization. The implementation is relatively straightforward, and like the other cases of instruction legalization for compound types, it follows the shape of the `LegalType`/`LegalVal` cases. The one interesting bit is that we need to be a bit careful and filter the single argument list for `makeStruct` into two in the case where we generate a "pair" type for something that has both "ordinary" and "special" (resource) fields. Luckily the `PairInfo` data that was generated by type legalization has exactly the information we need (by design). This change does not address several issues that could be handled in follow-on changes: * The `makeArray` instruction will face similar issues if it is applied to a type that requires legalization: we'd need to turn an array of `LegalVal`s into a bunch of distinct arrays. * The error message when we hit the unimplemented case here isn't great. Ideally we should provide the line number of the instruction that fails in an error message when legalization fails. This change tries to focus narrowly on the bug at hand, and leave these issues for later changes. 21 May 2018, 22:53:01 UTC
e2c2c22 Generate Visual Studio projects using Premake (#557) * Generate Visual Studio projects using Premake This change adds a `premake5.lua` file that allows us to generate our Visual Studio solution using Premake 5 (https://premake.github.io/). The existing Visual Studio solution/projects are now replaced with the Premake-generated ones, and project contributors will be expected to update these by running premake after adding/removing files. I have *not* changed the Linux `Makefile` build at all, because that file is also used for things like running our tests, so that clobbering it with a premake-generated `Makefile` would break our continuous testing. Hopefully future changes can switch to a generated `Makefile` and perhaps even add an Xcode project as well. Notes: * The `build/slang-build.props` file is no longer needed/used, so it has been removed. * The `slang-eval-test` test fixture wasn't following our naming conventions for its directory path, so it was updated to streamline the Premake build configuration work. This required changes to the `Makefile` as well. * Some seemingly unnecessary preprocessor definitions that were specified for `core` and `slang-glslang` have been dropped. We will see if anything breaks from that. * Possible fixup for Premake vpath issue Premake's `vpath` feature seems to be nondeterministic about the order it applies filters (because Lua isn't deterministic about the order of entries in a key/value table), and as a result we can end up in a weird case where it decides that a `foo.cpp.h` file matches the `**.cpp` filter (I'm not sure why) before it tests against the `**.h` filter. This change uses an (undocumented) Premake facility to set `vpath` using a list of singleton tables, which seems to fix the order in which things get tested. 
* Remove support for "single-file" build of Slang The `hello` example was the only bit of code that uses the "single-file" way of building Slang, and this had already run up against limitations of the Visual Studio compilers in its Debug|x64 build. Rather than mess with Premake to make it pass through the `/bigobj` linker flag that is needed to work around the issue, it makes more sense just to stop using/supporting the feature since we wouldn't want users to depend on it anyway (our documentation no longer refers to it). While I was at it I went ahead and made sure that the `SLANG_DYNAMIC` flag doesn't need to be set manually, so that instead there is a non-default `SLANG_STATIC` option (not that we have a static-library build of Slang at the moment). 11 May 2018, 23:34:19 UTC
34ecdb7 Add tests for custom #error and #warning messages (#562) Resolves #310 The behavior was fixed in #484, but that change didn't add test cases to cover the new functionality. 11 May 2018, 22:05:12 UTC
5e604a6 Cleanups around behavior when the compiler fails (#553) * Cleanups around behavior when the compiler fails * Add another case where we try to `noteInternalErrorLoc()` if an exception is thrown. This one is in the logic for emitting an IR instruction. This could be improved by adding another layer at the function level (as a catch-all for instructions with no location), but something is better than nothing. * Change a bunch of `assert()`s over to `SLANG_ASSERT()`s, so that we can theoretically take more control over them (e.g., make release builds with asserts enabled) * Some other small cleanups around the assertions we perform. In the survey I made, I didn't really see many obvious "smoking gun" cases where we could produce a significantly better error message for some of the unimplemented/unexpected paths, other than to actually implement the missing functionality. * fixup 11 May 2018, 20:56:14 UTC
10c0ffa Add test for associated type from global generic parameter (#561) Resolves #357 The example shader from that issue has been added as a test case, and works with the top-of-tree Slang compiler (most likely due to the changes introduced with the IR-level type system). 11 May 2018, 16:15:04 UTC
2d96e1f Fixes #559 (#560) 11 May 2018, 14:51:11 UTC
b0413c1 Merge pull request #558 from tfoleyNV/bad-type-emit-workaround Workaround for cases where we emit illegal-but-unused types 11 May 2018, 05:58:05 UTC
4e07e22 Workaround for cases where we emit illegal-but-unused types This is a quick workaround to deal with cases where we try to emit an unreferenced IR type that contains references to pre-legalization types (which might have been removed from the IR even though they are still referenced). The basic fix is to *not* add types to our global order of instructions to emit by default, and only add them on demand as they are referenced by other instructions. This is not a real fix for the underlying issue, which is that type legalization is only being applied to a subset of global instructions instead of all of them. A more detailed fix for that problem will need to be devised next. This fix also doesn't address the question of why an unreferenced `struct` type came to be present in the IR code passed to the back-end in the first place. It would be good to understand how this scenario is arising. 11 May 2018, 03:28:39 UTC
140e51e Feature/xunit (#555) * Remove serialization of screen captures from a renderer implementation, capture now writes to a Surface. Then client code can decide to serialize (or use as needed). * Improved comment for captureScreenSurface. * First pass support for xunit output. * Controlling output to improve xunit support. * Xml encoding and writing out of error/skip for xunit. * Fixes to make build on linux. * Fix typo for linux build. 08 May 2018, 22:17:29 UTC
b67f656 Re-enable emission of #line directives and clean up output (#554) This was based on feedback from Falcor users, who felt like changing the default to have no line directives didn't work out well. Since I'd only made them disabled by default based on what I perceived to be Falcor's needs, I'm happy to turn this back on by default. I also added a few changes to clean up the output: * Don't emit a directive for a sub-expression, since that breaks up the code too much. The only directives inside a function body will be on top-level instructions that didn't get folded into a use site. * Add logic to emit a directive for top-level declarations (globals, functions, structs), and clean up their printing so that they put any extra space *after* the declaration rather than before (so the line numbers can be accurate) * Don't emit the file path part of a directive if it would be the same as the previous directive. This makes the output less noisy, at the cost of having to work your way backward to find the file if you are looking directly at the output. There are certainly more cleanups possible, but these make the output decent enough to be useful for working backwards from a downstream compiler error to the offending code. 04 May 2018, 20:42:57 UTC
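The last cleanup above (dropping the file path when it matches the previous directive) can be illustrated with a minimal Python sketch. It is not taken from the Slang sources; the function name `emit_line_directives` and the `(path, line)` input shape are assumptions made for illustration only.

```python
def emit_line_directives(locations):
    """Return one '#line' directive per (path, line) pair.

    Hypothetical helper: the file path is only written when it differs
    from the path used by the previous directive.
    """
    directives = []
    previous_path = None
    for path, line in locations:
        if path == previous_path:
            # Same file as the last directive: the line number is enough.
            directives.append(f"#line {line}")
        else:
            directives.append(f'#line {line} "{path}"')
            previous_path = path
    return directives
```

As the message notes, the cost of this scheme is that a reader looking at a bare `#line 7` has to scan backward to the most recent directive that names a file.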
07a59b6 Allow more complex compound expressions when emitting from IR (#552) The emit logic already had an idea of when an instruction should be "folded" into its use site(s), and this change just expands on that logic to try to be more aggressive. The basic idea is that instead of outputting this: ```hlsl float4 _S3 = a_0 + b_0; float4 _S4 = c_0 * _S3; d_0 = _S4; ``` we can hopefully output something like this: ```hlsl d_0 = c_0 * (a_0 + b_0); ``` The way this works is that after dealing with the various special cases that decide an instruction `I` must/cannot be folded in, we look and see if it has the following properties: * `I` has no side effects * `I` has a single user, `U` * `I` and `U` are in the same block (and `I` comes before `U` in that block) * for every instruction `X` between `I` and `U` (exclusive), `X` has no side effects If all of these conditions are true, then `I` can be folded in as a sub-expression when we emit `U`. This change doesn't affect most of our test output, but there is still a single test with SPIR-V output that we compare against a GLSL baseline, and so that baseline had to be modified to match the GLSL we now generate. Similar to #547, this change is not meant to provide a complete solution, but rather to take a concrete but low-risk step toward improving our output. Opportunities to improve the results further include: * We can/should ensure that when outputting sub-expressions we keep extra parentheses to a minimum. The old logic for emitting from an AST had support for "unparsing" expressions with minimal parentheses, and we should try to do the same. This can be error-prone, because omitting parentheses can lead to silent failures, so it must be done carefully. * We could try to be more aggressive about detecting what operations might have side effects. The most interesting case is function calls, where we should try to check if the callee is a function known to be side-effect-free. 
We could start by annotating most builtin functions with an attribute/decoration that indicates freedom from side effects. Deriving this attribute for user functions could be interesting, but we'd have to be careful since "nontermination" is technically a side effect. * We could try to be more aggressive about determining what side effects in instructions `X` are "safe" for the instruction `I` to move across. For example, if `I` is a load from variable `a` and `X` is a store to variable `b`, then that would seem to be safe. This starts to get into issues of instruction scheduling, though, and that is probably beyond what we want Slang to be doing. 04 May 2018, 19:01:30 UTC
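The four folding conditions listed in this commit can be sketched as a small Python check. This is only an illustration under assumed data structures: the `Inst` class, the representation of a function as a list of blocks (each a list of instructions), and the name `can_fold` are all hypothetical and do not mirror Slang's actual `emit.cpp` code.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # eq=False: instructions compare by identity
class Inst:
    name: str
    has_side_effects: bool = False
    operands: list = field(default_factory=list)

def can_fold(blocks, block, inst):
    """Return True if `inst` may be emitted as a sub-expression of its user."""
    # Condition 1: the instruction itself has no side effects.
    if inst.has_side_effects:
        return False
    # Condition 2: exactly one user anywhere in the function.
    users = [(b, u) for b in blocks for u in b
             if any(op is inst for op in u.operands)]
    if len(users) != 1:
        return False
    user_block, user = users[0]
    # Condition 3: same block, and the definition comes before the use.
    if user_block is not block:
        return False
    i, u = block.index(inst), block.index(user)
    if i >= u:
        return False
    # Condition 4: nothing with side effects sits strictly between them.
    return not any(x.has_side_effects for x in block[i + 1:u])
```

For the `_S3`/`_S4` example in the message, the add feeding the multiply and the multiply feeding the store both satisfy the check, which is what allows the single statement `d_0 = c_0 * (a_0 + b_0);` to be produced.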
ee47232 Use Surface for screen capture in Renderer interface (#551) * Remove serialization of screen captures from a renderer implementation, capture now writes to a Surface. Then client code can decide to serialize (or use as needed). * Improved comment for captureScreenSurface. 04 May 2018, 16:00:53 UTC
494330d Add a pass for computing dominator trees (#541) This code is currently not used by anything, but I wanted to check in a first pass at an implementation of dominator tree construction so that we don't have to keep avoiding implementing algorithms that rely on having dominator information available. The algorithm used to construct the dominator tree is taken from "A Simple, Fast Dominance Algorithm" by Keith D. Cooper, Timothy J. Harvey, and Ken Kennedy. This is not the "best" algorithm in terms of asymptotic performance, but it is among the simplest algorithms for computing a dominator tree that still outperforms naive iterative set-based methods. The actual data structure and API for the dominator tree has a bit of "cleverness" in it to try to make the common queries reasonably fast (e.g., you can check whether A dominates B in constant time). My hope is that even if we implement a more advanced algorithm for constructing the dominator tree, we can retain compatibility with passes that might make use of this API. Because no code is currently using this logic, I have done only minimal testing by stepping through this code and validating the results on paper for some very small CFGs. More serious testing/debugging may need to wait until we have an optimization pass that needs the dominator tree we compute here. One open question I have is how best to introduce traditional unit testing into Slang, since this is an example of code that would benefit greatly from being unit tested. 04 May 2018, 00:40:26 UTC
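For readers unfamiliar with the Cooper, Harvey, and Kennedy algorithm named above, here is a compact Python sketch of its iterative form. It assumes a CFG given as a dictionary from block name to successor list; the function names are hypothetical, and the `dominates` query shown here walks the tree rather than using the constant-time scheme the commit describes.

```python
def compute_idoms(succs, entry):
    """Map each reachable block to its immediate dominator (entry maps to itself)."""
    # Depth-first search to get a reverse postorder numbering.
    postorder, visited = [], set()
    def dfs(node):
        visited.add(node)
        for s in succs.get(node, []):
            if s not in visited:
                dfs(s)
        postorder.append(node)
    dfs(entry)
    rpo = postorder[::-1]
    index = {n: i for i, n in enumerate(rpo)}
    preds = {n: [] for n in rpo}
    for n in rpo:
        for s in succs.get(n, []):
            preds[s].append(n)

    idom = {entry: entry}
    def intersect(a, b):
        # Walk both "fingers" up the partial tree until they meet.
        while a != b:
            while index[a] > index[b]:
                a = idom[a]
            while index[b] > index[a]:
                b = idom[b]
        return a

    changed = True
    while changed:
        changed = False
        for n in rpo[1:]:
            new_idom = None
            for p in preds[n]:
                if p in idom:  # only predecessors processed so far
                    new_idom = p if new_idom is None else intersect(p, new_idom)
            if idom.get(n) != new_idom:
                idom[n] = new_idom
                changed = True
    return idom

def dominates(idom, a, b):
    """True if block a dominates block b (linear walk up the tree)."""
    while True:
        if a == b:
            return True
        if idom[b] == b:  # reached the entry block
            return False
        b = idom[b]
```

On a diamond-shaped CFG (`entry` branching to `a` and `b`, both joining at `join`), neither arm dominates `join`; its immediate dominator is `entry`.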
00afea1 Pass through original names for most declarations (#547) The basic idea here is that when lowering to the IR, the front-end will attach a "name hint" to the IR instruction(s) that represent a given declaration, and then the passes that work on the IR will try to preserve and propagate those names, and then finally the emit logic will use them in place of mangled or unique names when available. This change does *not* try to deal with the issues that arise when we try to use those variable names in the output without any modification (e.g., handling cases where they might clash with keywords or builtins in the target language). Instead, it tries to establish baseline behavior for propagating through names, so that a later change can concentrate on the issue of using those names exactly when it is legal to do so. In order to avoid issues around the name "hints" causing problems we take two main steps: 1. We "scrub" each name to reduce it down to the allowed set of identifier characters in C-like languages, and then ensure that it doesn't do things that would be illegal in some downstream languages (e.g., consecutive underscores are not allowed in GLSL) or could clash with Slang's mangled names. This process isn't guaranteed to give distinct results for distinct inputs (it isn't a mangling scheme, after all). 2. We generate a unique ID for each occurrence of a given name and always use that as a suffix. This means that even if a name happens to overlap with a keyword (if you somehow have a variable named `do`), we will still add a suffix that makes it not a problem (we'd output `do_0` which is fine). The logic for generating these names is mostly straightforward. For simple variables, we use their given name directly, while for other declarations we try to form a name that includes their parent declaration (e.g. `SomeType.someMethod`). Various IR passes need to propagate or preserve this information. 
The most interesting is type legalization, when we take a variable with an aggregate type and split some of the fields out into their own variables. In that case we generate "dotted" names like `someVar.someTexture` and rely on the emit logic to turn that into `someVar_someTexture`. During SSA generation, if we are promoting a variable to SSA temporaries, we will try to propagate the name of the variable over to the temporaries (unless they already have a name from some other place). The same applies to block parameters ("phi nodes"). Many of the test changes need their expected output to be updated for this change. Luckily in most cases the output has gotten easier to understand. 03 May 2018, 23:34:49 UTC
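The two steps described in this commit (scrubbing a name hint, then appending a per-name counter) can be sketched in Python as follows. The names `scrub_name` and `NameGenerator` are hypothetical, and the exact scrubbing rules are assumptions: the sketch only handles illegal characters and runs of underscores, and does not model clashes with Slang's mangled names, empty hints, or hints that start with a digit.

```python
import re
from collections import defaultdict

def scrub_name(hint):
    """Reduce a name hint to C-like identifier characters."""
    # Replace anything outside [A-Za-z0-9_] (e.g. the '.' in dotted names).
    scrubbed = re.sub(r"[^A-Za-z0-9_]", "_", hint)
    # Collapse runs of underscores, since consecutive underscores are not
    # allowed in GLSL identifiers.
    scrubbed = re.sub(r"_+", "_", scrubbed)
    # Trim edge underscores so that adding a "_N" suffix cannot create "__".
    return scrubbed.strip("_")

class NameGenerator:
    """Hands out `<scrubbed>_<N>` names, counting occurrences per scrubbed name."""
    def __init__(self):
        self._counts = defaultdict(int)

    def unique(self, hint):
        base = scrub_name(hint)
        n = self._counts[base]
        self._counts[base] += 1
        return f"{base}_{n}"
```

Because the suffix is always added, a hint such as `do` comes out as `do_0` and cannot collide with the keyword, matching the example in the message.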
f847294 Added Surface type - as a simple value type to hold a 2d collection of pixels. (#548) Added PngSerializeUtil allows currently for just writing Surface of RGBA format. Removes dependency on stbi_image except for in PngSerializeUtil. Removed use of gWindowWidth/Height globals - pass the height into initialize or Renderer. 03 May 2018, 22:42:13 UTC
c216f00 Fixes based on review of vulkan-first-render PR #545 (#546) 03 May 2018, 21:17:05 UTC
367f3a7 Feature/vulkan first render (#545) * First pass at InputLayout for Vulkan Add support for RGBA_Float32 * Use VulkanModule and VulkanApi to handle accessing Vulkan types. * First pass at Vulkan swap chain/Device queue. * Added VulkanUtil for generic function functions. * Move more functionality to VulkanApi and VulkanUtil. Make Buffer able to initialize itself. * More tidy up around VulkanDeviceQueue * First pass use of VulkanDeviceQueue in VkRenderer * First pass use of VulkanSwapChain on VkRenderer * Added depth formats. Binding for constant and vertex buffers for Vulkan. * Setting up VkImageView on backbuffers. * First pass support for setting up vkRenderPass. * Fixes to work around Vulkan swap chain/verification issues. * Added support for Pipeline and a pipeline cache. * Working without waiting - because use of pipeline cache. * Added support for VkFramebuffer in Vulkan. * First pass at creating Vulkan graphics pipeline. * More efforts to get Vulkan to render. * Small improvement for checking of Binding flags. * Removed setConstantBuffers from the Renderer interface - so that all resource binding takes place through the BindingState. To make this work required a 'hack' in render-test main.cpp - so that the constant buffer binding that is needed in some tests is only added when it doesn't clash. * RendererID -> unified into RendererType. Added getRendererType to Renderer interface. Added ProjectionStyle, and function to get from RendererType. Added getIdentityProjection to RendererUtil - to get projection that is the 'identity' - but hits the same pixels for all projection styles. * Fix build problem on Win32 on Vulkan where should use VK_NULL_HANDLE. * Improve naming, comments. Remove dead code. * Remove unwanted comment. 03 May 2018, 18:25:13 UTC
7893549 Merge pull request #543 from csyonghe/master Speedup type checking using cached overload resolution results. 03 May 2018, 03:01:48 UTC
235d6aa Merge branch 'master' into master 03 May 2018, 00:39:10 UTC
0399d99 Speedup type checking using cached overload resolution results. This change adds caches to built-in operator overload resolution and type coercion to avoid running these time-consuming operations every time. - Adds `TypeCheckingCache` type, which is defined in check.cpp, that contains two dictionaries for the cached results of `ResolveInvoke` and `CanCoerce` calls. - Add `destroyTypeCheckingCache` and `getTypeCheckingCache` methods to `Session` class to reuse these cached results over the entire session. 02 May 2018, 22:20:32 UTC
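The caching idea can be sketched in a few lines of Python. Only the name `TypeCheckingCache` and the existence of two dictionaries come from the commit message; the field names, the key shape, and the `can_coerce_cached` helper are assumptions for illustration, not the code in check.cpp.

```python
class TypeCheckingCache:
    """Two dictionaries of memoized results, meant to live for a whole session."""
    def __init__(self):
        self.resolved_overloads = {}  # hypothetical field: cached overload resolution
        self.coercions = {}           # hypothetical field: cached coercion checks

def can_coerce_cached(cache, to_type, from_type, slow_check):
    """Run the expensive `slow_check` at most once per (to_type, from_type) pair."""
    key = (to_type, from_type)
    if key not in cache.coercions:
        cache.coercions[key] = slow_check(to_type, from_type)
    return cache.coercions[key]
```

The choice of key matters: a later fix in this log (#576) describes a bug where the operator cache key was derived from the first definition found, so a prefix `++a` could pick up the postfix `operator++` entry.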
384df86 Add support for "swizzled stores" (#544) This was a known issue in our IR representation, which was now biting a user. The basic problem is that in code like the following: ```hlsl RWStructuredBuffer<float4> buffer; ... buffer[index].xz = value; ``` we ideally want to be able to reproduce the original HLSL code exactly, but that requires directly encoding the way that this code writes to two elements of a vector, but not the others. The current lowering strategy we had produced IR something like: ```hlsl float4 tmp = buffer[index]; tmp.xz = value; buffer[index] = tmp; ``` That transformation might seem valid, but it has some big problems: * It generates UAV reads that are not needed, which could impact performance * It performs read-modify-write operations on memory that the programmer didn't explicitly write, which could create data races The fix here is somewhat obvious: if the "base" of a swizzle operation on a left-hand side resolves to a pointer in our IR, then we can output a "swizzled store" instead of the read-modify-write dance. We currently keep the read-modify-write around since it is potentially needed as a fallback in the general case. Along the way I also tried to make sure that we handle the case where we have a swizzle of a swizzle on the left-hand side: ```hlsl buffer[index].xz.y = value; ``` That code should behave the same as `buffer[index].z = value`. I am currently detecting and cleaning up this logic in the lowering path for `SwizzleExpr`, because that is the only place in the lowering logic that "swizzled l-values" currently get created. 02 May 2018, 21:44:13 UTC
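The swizzle-of-a-swizzle rule and the effect of a swizzled store can be modeled with plain Python lists. This is a sketch of the semantics only; `compose_swizzles` and `swizzled_store` are hypothetical names, and vectors are represented as four-element lists rather than IR values.

```python
COMPONENTS = "xyzw"

def compose_swizzles(first, second):
    """Collapse `.first.second` into a single swizzle on the base vector.

    Each letter of `second` indexes into the result of `first`, so
    ".xz.y" picks element 1 of "xz", which is "z".
    """
    return "".join(first[COMPONENTS.index(c)] for c in second)

def swizzled_store(vector, swizzle, values):
    """Write only the named components; all other elements stay untouched."""
    for component, value in zip(swizzle, values):
        vector[COMPONENTS.index(component)] = value
```

Unlike the read-modify-write form, `swizzled_store` never reads the destination, which is the property the commit relies on to avoid extra UAV reads and unintended writes to the other components.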
60bcc68 Add support for explicit register space bindings (#542) This change adds support for specifying explicit register spaces, like: ```hlsl // Bind to texture register #2 in space #1 Texture2D t : register(t2, space1); ``` I added a test case to confirm that the register space is properly propagated through the Slang reflection API. This change also adds proper error messages for some error/unsupported cases that weren't being diagnosed: * Specifying a completely bogus register "class" (e.g., `register(bad99)`) * Failing to specify a register index (`register(u)`) * Specifying a component mask (`register(t0.x)`) * Using `packoffset` bindings I added test cases to cover all of these, as well as the new errors around support for register `space` bindings. In order to get the existing tests to pass, I had to remove explicit `packoffset` bindings from some DXSDK test shaders. None of these `packoffset` bindings were semantically significant (they matched what the compiler would do anyway, for both Slang and the standard HLSL compiler). Removing them is required for Slang now that we give an explicit error about our lack of `packoffset` support. In a future change we might add logic to either detect semantically insignificant `packoffset`s, or to just go ahead and support them properly (as a general feature on `struct` types). 02 May 2018, 18:40:09 UTC
d3c1c8b Fix emit logic when "terminators" occur in the middle of a block (#540) Fixes #527 There were a few problem cases for the IR emit logic. The most obvious, which came up in #527, is that a function body with multiple `return` statements would generate invalid code: ```hlsl int foo() { return 1; int x = 2; return x; } ``` In that case the IR for `foo` would have a single block that has two `return` instructions, which is invalid. Another case that seems to be arising more often, but that had less obvious consequences, was when one arm of an `if` statement ends in a `return`: ```hlsl if(a) { return b; } else { int c = 0; } int d = 0; ``` In that case, the `return` instruction for `return b` would be followed by a branch to the end of the `if` (the `int d = 0;` line), because that would be the normal control flow without the early `return`. The fix implemented here is to have the IR lowering logic be a bit more careful on two fronts: 1. When emitting a branch, check if the block we are emitting into has already been terminated, and if so just don't emit the branch (since we are logically at an unreachable point in the CFG). 2. Whenever we are about to emit code for a (non-empty) statement, ensure that the current block being built is unterminated. If the current block is terminated, then start a new one. Case (2) will only matter when there is unreachable code (e.g., in the function `foo()`, the declaration of `x` and the second `return` can never be reached), so I added a warning in that case, and included a test case that triggers the new warning (with a function like `foo()` above). 02 May 2018, 13:45:35 UTC
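The two rules from d3c1c8b can be illustrated with a toy builder. This is a sketch under simplifying assumptions, not the lowering code itself: `Block`, `Builder`, `emitBranch`, and `emitStmt` are hypothetical names, and instructions are modelled as plain opcode strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, heavily simplified IR: a block is a list of opcode names.
struct Block
{
    std::vector<std::string> insts;

    // A block is terminated once its last instruction is a terminator.
    bool isTerminated() const
    {
        if (insts.empty()) return false;
        const std::string& last = insts.back();
        return last == "return" || last == "branch";
    }
};

struct Builder
{
    std::vector<Block> blocks;
    bool sawUnreachableCode;

    Builder() : blocks(1), sawUnreachableCode(false) {}

    Block& current() { return blocks.back(); }

    // Rule 1: never add a branch after an existing terminator.
    void emitBranch()
    {
        if (current().isTerminated()) return;
        current().insts.push_back("branch");
    }

    // Rule 2: code for a statement always goes into an unterminated block.
    void emitStmt(const std::string& op)
    {
        if (current().isTerminated())
        {
            sawUnreachableCode = true; // the point where a warning could be issued
            blocks.push_back(Block());
        }
        current().insts.push_back(op);
    }
};
```

Feeding the three statements of `foo()` through `emitStmt` ("return", a variable declaration, "return") leaves two blocks with one terminator each, and sets the unreachable-code flag.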
d90d73a Diagnose attempts to write to fields in methods (#530) * Diagnose attempts to write to fields in methods Work on #529 This helps to avoid the case where a Slang user writes a struct with helpful `setter` methods, and finds that it doesn't work as expected because the `this` parameter is currently handled like an `in` parameter (passed by value, but mutable in the callee). Fixing this issue actually involved making a more broad fix to how l-value-ness is propagated. The existing checking logic was assuming that l-value-ness is just a property of a particular member declaration (e.g., a field is either mutable or not), and didn't take into account whether the "base expression" was mutable. This change fixes that oversight, which might lead to additional errors being issued if we aren't correctly making things mutable when we should. A `ThisExpr` was already immutable by default, so that part didn't actually need to change. Just propagating its immutability through was enough. As an additional assistance to users, I have added an extra diagnostic that triggers when a "destination of assignment is not an l-value" error occurs and the left-hand-side expression seems to be based on `this` (whether implicitly or explicitly). This will ideally help users to understand that the "setter" idiom is not yet supported. * Fixed setRadius typo 02 May 2018, 00:26:20 UTC
809f520 Cleanups (#539) * Cleanup: remove unused files from project * Cleanup: move IRModule forward declaration into correct namespace 01 May 2018, 21:32:03 UTC
3ace6e7 Remove unused local variable in vm.cpp (#533) Unused local variable prevents compiling when warnings are treated as errors 29 April 2018, 01:21:22 UTC
b54629f Fix for global generic parameter substitution (#512) The problem here arises when multiple entry points are compiled in one pass. Each entry point has its own arguments for global generic parameters, and leads to us emitting a `bindGlobalGenericParameter(p, val)`. But once the first entry point's substitutions are applied, the second entry point's code gives `bindGlobalGenericParameter(val, val)` and the compiler crashes (in debug builds) because `val` is not a global generic parameter. This change just applies a quick fix. If we see `bindGlobalGenericParameter(x,y)` during specialization, and `x` is not a global generic parameter, then we skip it. The right long-term fix is to change the compiler's representation of global generic arguments so that they live on a `CompileRequest` instead of an `EntryPointRequest`. That is a more significant change (with impact on the public API), so I'm inclined to leave it as a cleanup for another day (given that no customers are using global generic parameters today). 25 April 2018, 18:58:06 UTC
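The quick fix in b54629f is a guard: a binding is ignored when its first operand is no longer a global generic parameter. A minimal sketch of that guard follows, with names that are assumptions for illustration only (`Binding`, `collectBindings`); the real instruction operands are IR values, not strings.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for `bindGlobalGenericParameter(param, value)`.
typedef std::pair<std::string, std::string> Binding;

// Collect substitutions, skipping any binding whose first operand is not
// a known global generic parameter (for example because an earlier entry
// point's substitution already replaced it with a value).
std::map<std::string, std::string> collectBindings(
    const std::set<std::string>& globalGenericParams,
    const std::vector<Binding>& bindings)
{
    std::map<std::string, std::string> substitutions;
    for (const Binding& binding : bindings)
    {
        if (globalGenericParams.count(binding.first) == 0)
            continue; // skip instead of asserting
        substitutions[binding.first] = binding.second;
    }
    return substitutions;
}
```

Given the bindings `(p, val)` and `(val, val)` with `p` as the only global generic parameter, only the first binding is kept.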
9a7849d Improve SSA promotion for arrays and structs (#521) * Improve SSA promotion for arrays and structs Fixes #518 The existing SSA pass would only handle `load(v)` and `store(v,...)` where `v` is the variable instruction, and would bail out if `v` was used as an operand in any other fashion. The new pass adds support for `load(ac)` where `ac` is an "access chain" with a grammar like: ac :: v | getElementPtr(ac, ...) | getFieldAddress(ac, ...) What this means in practical terms is that we can promote a local variable of array or structure type to an SSA temporary even if there are loads of individual elements/fields, as long as any *assignment* to the variable assigns the whole thing. I've added a test case to confirm that this change fixes passing of arrays as function parameters for Vulkan. * Fixup: disable test on Vulkan because render-test isn't ready This is a fix for Vulkan, but I don't think our testing setup is ready for it. * Fixup: error in unreachable return case, caught by clang * Fixups based on testing These are fixes found when testing the original changes against the user code that originated the bug report. * `emit.cpp`: Make sure to handle array-of-texture types when deciding whether to declare a temporary as a local variable in GLSL output * `ir-legalize-types.cpp`: Make a note of a source of validation failures that we need to clean up sooner or later (just not in scope for this bug fix change).
* `ir-ssa.cpp`: * When checking if something is an access chain with a promotable var at the end, make sure the recursive case recurses into the "access chain" logic instead of the leaf case * Add some assertions to guard the assumption that any access chain we apply has been scheduled for removal * Correctly emit an element *extract* instead of getting an element *address* when promoting an element access into an array being promoted * Eliminate a wrapper routine that was setting up an `IRBuilder` and use the one from the block being processed in the SSA pass (since it was set up for stuff just like this) * `ir-validate.cpp` * Add a hack to avoid validation failures when running IR validation on the stdlib code. This case triggers for an initializer (`__init`) declaration inside an interface, since the logical "return type" is the interface type itself, which has no representation at the IR level and thus yields a null result type in a `FuncType` instruction. 23 April 2018, 17:37:56 UTC
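The access-chain grammar from 9a7849d, including the fixup that the recursive case must recurse into the chain logic rather than the leaf case, can be sketched as a small recursive predicate. The `Op` enum, `Inst` struct, and `isAccessChainFor` below are hypothetical simplifications, not the types used in `ir-ssa.cpp`.

```cpp
#include <cassert>

// Hypothetical, simplified instruction node.
enum class Op { Var, GetElementPtr, GetFieldAddress, Other };

struct Inst
{
    Op op;
    const Inst* base; // first operand for address ops, otherwise nullptr
};

// Returns true if `inst` matches
//   ac :: v | getElementPtr(ac, ...) | getFieldAddress(ac, ...)
// where `v` is the specific variable being considered for promotion.
bool isAccessChainFor(const Inst* inst, const Inst* var)
{
    if (inst == nullptr) return false;
    if (inst == var) return inst->op == Op::Var; // leaf case: the variable itself
    if (inst->op == Op::GetElementPtr || inst->op == Op::GetFieldAddress)
        return isAccessChainFor(inst->base, var); // recurse into the chain case
    return false;
}
```

A chain such as `getElementPtr(getFieldAddress(v, ...), ...)` is accepted for `v`, while the same chain is rejected for any other variable, and any other use of `v` as an operand is rejected.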
627de1c Fix successor computation for `switch` instruction (#520) Fixes #519 The code was leaving out the `default` label from the successor list, which would break any passes that require an accurate CFG (with the big one right now being the SSA-formation pass). 23 April 2018, 17:37:24 UTC
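To make the bug in 627de1c concrete, here is a small sketch of a successor computation that includes the `default` label. The `SwitchInst` layout and `getSuccessors` are assumptions made for illustration; labels are modelled as integer block ids.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a `switch` terminator: block ids for its targets.
struct SwitchInst
{
    int defaultLabel;
    std::vector<int> caseLabels;
};

// Every block the switch can transfer control to, including `default`.
std::vector<int> getSuccessors(const SwitchInst& inst)
{
    std::vector<int> successors;
    successors.push_back(inst.defaultLabel); // the edge the bug left out
    successors.insert(successors.end(),
                      inst.caseLabels.begin(), inst.caseLabels.end());
    return successors;
}
```

A switch with two case labels therefore reports three successors; a pass that walks the CFG from this list will visit the `default` block as well.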
163d306 Better diagnostics when compilation is aborted (#517) * Improve messages when compilation is aborted. Make sure to include the information from any `Slang::Exception` that was thrown, so that the poor user can at least point us at our own message string from an assertion failure. This doesn't provide them line-number information in their code or the Slang codebase, so there is still work to be done in making the compiler more friendly about this stuff. * When aborting compilation, try to note what source location we were working on This is handled by having exception handlers on the stack at key bottleneck points in semantic checking and IR generation, which can then emit a diagnostic to note what we were working on when things failed. This is not intended to be an indication to the user that their code is at fault for a compiler crash (it is always our fault), but might give them a chance to work around whatever bug is blocking them. 21 April 2018, 00:54:39 UTC
2f782d4 Diagnose use of an implicit cast as an argument for an `out` parameter (#516) Work on #499 Two big fixes here: * The logic for checking constraints on `out` arguments wasn't actually triggering because it relied on function parameters being given an `OutType` if they are marked `out`, but the code wasn't actually doing that. Fixing the computation of types for functions resolved that issue. * Next, I added a specific diagnostic to follow up the "expected an l-value" error to let the user know that their argument was implicitly converted, and that is why it doesn't count as an l-value in Slang's rules. I've added a test case to ensure that we retain this diagnostic until we can do a true fix for the issue. The right long-term fix is to have an AST representation of all the implicit casts involved (e.g., in both directions for an `inout` parameter), and then have the IR generate explicit code for the conversions in each direction (the `LoweredVal` representation can handle this sort of thing). 20 April 2018, 23:56:33 UTC
c73ccbc Fixes/improvements based on feature/render-binding-resource (#511) * Dx12 rendering works in test framework. * Turn on dx12 render tests. * Split out functions for construction of Renderer types into ShaderRendererUtil. Removed the serialization of buffers code into test-render * Improvements in documentation and typename in BindingState types. RegisterSet -> CompactBindIndexSlice RegisterList -> BindIndexSlice RegisterDesc -> ShaderBindSet * Fix debug build break. 20 April 2018, 18:59:17 UTC
4c751df Separation of Binding/Resource construction on Renderer interface (#508) * Dx12 rendering works in test framework. * Turn on dx12 render tests. * First pass at Resource and TextureResource/BufferResource types. * Fix bug in Dx11 impl for BufferResource. * Dx12 supports TextureResource and binds using TextureResource type, and all tests pass. * Added TextureBuffer::Size type to make handling mips a little simpler. * Small improvements to Dx12 constant buffer binding Removed k prefix on an enum * First pass impl of dx11 createTextureResource Added setDefaults to TextureResource::Desc and BufferResource::Desc to simplify setup accessFlags -> cpuAccessFlags desc -> srcDesc * Split out generateTextureResource - can produce the texture using createTextureResource on the Renderer. * Added support for read mapping to Dx11 accessFlags -> cpuAccessFlags First pass at using TextureResource/BufferResource on Dx11 Some tests fail with this checkin * TextureResource working on all tests on dx11. * Construct ResourceBuffers on Dx11 and Dx12 using utility function createInputBufferResource. * First pass at OpenGl TextureResource * Small fixes to dx12 and dx11 setup. Gl working using BufferResource and TextureResource * Tidy up around the compareSampler - looks like the previous test was incorrect. * Small documentation /naming improvements. * Fix some more small documentation issues. * First pass testing out construction of binding resources external to Renderer implementation. * Moved some BindingState::Desc types to BindingState to make easier to use. * First pass of binding using BindingState::Desc for Dx11. * First pass at binding with dx12. * Fixed issues around separating dx12 binding from ShaderInputLayout * First pass at OpenGl state binding. * BindingState::Desc::Binding::Type -> BindingType * Use Buffer to manage life of vk resources.
Construction of buffers handled by createBufferResource (BindingState doesn't have specialized logic) * Remove InputLayout types from binding so can create a binding independent of it. * Added upload buffer to BufferResource - could be used for write mapping. * m_samplers -> m_samplerDescs. First pass at Vk binding with BindingState::Desc. Small tidy/doc improvements. * First pass with binding all taking place through BindingState::Desc. All tests pass. * Removed support for creating BindingState from ShaderInputLayout * Remove serializeOutput from Renderer interface and all implementations. Implement map/unmap on vulkan Implement serializeBindingOutput which uses map/unmap and BindingState::Desc to write result. * Make implementation of BindingState use the BindingState::Desc for much of state - only hold api specific in BindingDetail per implementation. * Use Glsl binding on vulkan (was using hlsl). * BindingState::Desc::Binding -> BindingState::Binding. Made possible by impls using 'BindingDetail' for their specific needs. * Fix compile problems on win32. * Fix a typo in name createBindingSetDesc -> createBindingStateDesc 19 April 2018, 21:47:04 UTC
cbedf01 Fix GS cross-compilation after IR type system change (#507) The cross-compilation logic for geometry shaders would look through the user's entry point for calls like `someStream.Emit<X>(val)` and turn that into `outputGlobals = val; EmitVertex();`. It was recognizing the `Emit()` calls by looking at the callee in all `call` instructions and seeing if it was registered to lower to GLSL as `EmitVertex()`. The logic was trying to look "through" `specialize` instructions (to deal with the `<X>` bit in the call above), but this wasn't updated for the new IR encoding where the first operand to a `specialize` is the generic being specialized, and not the function nested inside it. The fix here is to properly look through both `specialize` instructions and generics. This is kind of a gross operation and we've done things like it in a few places, so it might be something we try to extract into a utility function in the future. 19 April 2018, 19:22:20 UTC
163bf58 Add type legalization support for "field extract" op (#501) The code was handling the "get field address" opcode (which takes a pointer to a struct and returns a pointer to a field), but didn't have a case for values. This was just an oversight. 19 April 2018, 16:45:40 UTC
c68c6fa Fix up DXR type emission from IR type system (#498) * There was a simple typo where we were emitting `RaytracingAccelerationStructureType` instead of `RaytracingAccelerationStructure` * The IR lowering logic was failing to handle types with an `__intrinsic_type` modifier (which maps them to a single IR opcode) that weren't in one of the various special cases. I added a catch-all case to the handling of `DeclRefType`. This notably affected the `RayDesc` type. * Even if we lower `RayDesc` to an intrinsic type, we still need to lower its *fields* too, and these were getting emitted with mangled names (as would happen for any user-defined fields). The solution I implemented was to allow for fields to have `__target_intrinsic` modifiers in the stdlib, to specify the un-mangled name they should use on each target. I'm not 100% happy with this solution, because it seems odd to have `RayDesc` be an intrinsic type, but then to also have field keys used in `getField` instructions as if it were an ordinary `struct`. It seems like a better solution would be to have it lower to an IR `struct`, just with an appropriate modifier. 19 April 2018, 16:34:54 UTC
17fa424 Fix output of `groupshared` with IR type system (#492) The basic problem was that the lowering logic was constructing (more or less) `Ptr<@GroupShared X>` instead of `@GroupShared Ptr<X>`. There were also problems with passes not propagating through rates that should have been (e.g., legalization). I've added a test case to actually validate `groupshared` support. 19 April 2018, 00:22:44 UTC
c3a27c0 Fix up name mangling/unmangling for extensions (#493) * Fix up name mangling/unmangling for extensions This is required for the unmangling we do on some builtin function names. The work here is mostly just a band-aid, and a more comprehensive pass over the name mangling/unmangling code is required to make any of this robust. * fixup: UNREACHABLE_RETURN argument 18 April 2018, 21:14:26 UTC
0450ca6 Fix some logic around legalization of sampler types (#496) The main error here was checking for `IRSamplerType` instead of `IRSamplerTypeBase`, which means the relevant logic only triggered for the `SamplerState` type and not the `SamplerComparisonState` type. The two affected places were type legalization (so that comparison samplers in `struct` types weren't being hoisted out) and the emit logic when deciding whether to introduce local temporaries (so we were emitting temporaries for comparison samplers, leading to GLSL errors). 18 April 2018, 19:57:34 UTC
00389a1 Feature/renderer binding (#489) * Dx12 rendering works in test framework. * Turn on dx12 render tests. * First pass at Resource and TextureResource/BufferResource types. * Fix bug in Dx11 impl for BufferResource. * Dx12 supports TextureResource and binds using TextureResource type, and all tests pass. * Added TextureBuffer::Size type to make handling mips a little simpler. * Small improvements to Dx12 constant buffer binding Removed k prefix on an enum * First pass impl of dx11 createTextureResource Added setDefaults to TextureResource::Desc and BufferResource::Desc to simplify setup accessFlags -> cpuAccessFlags desc -> srcDesc * Split out generateTextureResource - can produce the texture using createTextureResource on the Renderer. * Added support for read mapping to Dx11 accessFlags -> cpuAccessFlags First pass at using TextureResource/BufferResource on Dx11 Some tests fail with this checkin * TextureResource working on all tests on dx11. * Construct ResourceBuffers on Dx11 and Dx12 using utility function createInputBufferResource. * First pass at OpenGl TextureResource * Small fixes to dx12 and dx11 setup. Gl working using BufferResource and TextureResource * Tidy up around the compareSampler - looks like the previous test was incorrect. * Small documentation /naming improvements. * Fix some more small documentation issues. 17 April 2018, 20:59:03 UTC
15bff91 Propagate diagnostics when imported module has errors (#485) A previous fix avoided crashes when an `import`ed module has errors by making the "failed to import" error a fatal one. Unfortunately, the code path that handles fatal errors was failing to copy diagnostic output from the sink over to the member variable on the `CompileRequest` that exposes the output through the API. This meant that API users lost all context on error messages in `import`ed code. This change fixes the immediate issue by plumbing through the error output, but doesn't fix the more fundamental issue: the front-end should not crash when an `import` fails, by any means. 13 April 2018, 20:36:26 UTC
021a492 Preprocessor cleanups (#484) * For a `#error` or `#warning`, read the rest of the line as raw text to include in the error message * When skipping tokens (e.g., in an `#ifdef`d out block), don't emit errors on invalid characters * TODO: we could clearly get more efficient and skip whole raw lines in the future * Fix an issue when a macro invocation that expands to nothing (zero tokens) is the last thing before a directive. The preprocessor was returning the `#` as an ordinary token, because it has already gone past its test for directives. 13 April 2018, 00:08:52 UTC
baf194e Introduce an IR-level type system (#481) * Introduce an IR-level type system Up to this point, the Slang IR has used the front-end type system to represent types in the IR. As a result (but ultimately more importantly) the IR representation of generics and specialization has used AST-level concepts embedded in the IR. For example, to express the specialization of `vector<T,N>` to a concrete type `float` for `T`, we needed an IR operation that could represent the specialization, with operands that somehow represented the type argument `float`. The whole thing was very complicated. The big idea of this change is to introduce a new representation in which types in the IR are just ordinary instructions, so that using them as operands makes sense. The hierarchy of IR types closely mirrors the AST-side hierarchy for now, and that will probably be something we should maintain going forward. In order to make these changes work, though, I also had to do major overhauls of things like the way substitutions are performed, how we check interface conformances, the way lookup through interface types is done, etc. etc. This is a big change, and unfortunately any attempt to summarize it in the commit message wouldn't do it justice. * Fix 64-bit build warning * Fix up some clang warnings/errors 11 April 2018, 23:18:29 UTC
6322983 Feature/dx12 compute (#483) * Dx12 rendering works in test framework. * Turn on dx12 render tests. * Getting simpler dx12 compute tests to work. * With expected data in test - check for specialized and then for the default, so that multiple test can share the same expected data, but specialized cases can still be set. * Fixed construction and binding on dx12 textures. * Control which render apis used in test from command line. * Small aesthetic fixes in render-test/main.cpp. * Fix binding problem for uavs/srvs dx12. Previously tried to create srv/uav for StorageBuffers (like dx11 does), but the binding breaks as you can end up with two srvs using the same register. First pass at fixing problems with Texture creation for dx12 - assertions were hit with 3d or array textures. * Fixes to improve Dx12 setup shader resource views for cubemaps/arrays. * Fixed d3d12 textureSamplingTest - problem was that cubemap/array textures were not being uploaded correctly. * Changed the order of how binding of constant buffers (as just set on the Renderer) indexes. Previously they were given the lowest indices, but they clashed with the indices from the 'Binding'. Changing this means all tests run on d3d12. * Add code to allow use of warp (although not command line switchable yet). Fix problem setting up raw UAV - as identified by warp. * Added RenderApiUtil - which can detect if a render api is potentially available. * Moved render flag testing/parsing into RenderApiUtil. * Fix signed/unsigned warning. * Fixes around enums prefixed with k on the review of feature/dx12 compute branch. * Remove explicit -dx12 line in tests, as all can currently be generated from dx11 tests. 11 April 2018, 19:55:44 UTC