Revision history - None - origin: https://github.com/shader-slang/slang

visit type:

Revision	Author	Date	Message	Commit Date
9a7849d	Tim Foley	23 April 2018, 17:37:56 UTC	Improve SSA promotion for arrays and structs (#521) * Improve SSA promotion for arrays and structs Fixes #518 The existing SSA pass would only handle `load(v)` and `store(v,...)` where `v` is the variable instruction, and would bail out if `v` was used as an operand in any other fashion. The new pass adds support for `load(ac)` where `ac` is an "access chain" with a gramar like: ac :: v \| getElementPtr(ac, ...) \| getFieldAddress(ac, ...) What this means in practical terms is that we can promote a local variable of array or structure type to an SSA temporary even if there are loads of individual elements/fields, as along as any assignment to the variable assigns the whole thing. I've added a test case to confirm that this change fixes passing of arrays as function parameters for Vulkan. * Fixup: disable test on Vulkan because render-test isn't ready This is a fix for Vulkan, but I don't think our testing setup is ready for it. * Fixup: error in unreachable return case, caught by clang * Fixups based on testing These are fixes found when testing the original changes against the user code that originated the bug report. * `emit.cpp`: Make sure to handle array-of-texture types when deciding whether to declare a temporary as a local variable in GLSL output * `ir-legalize-types.cpp`: Make a not of a source of validation failures that we need to clean up sooner or later (just not in scope for this bug fix change). * `ir-ssa.cpp`: * When checking if something is an access chain with a promotable var at the end, make sure the recursive case recurses into the "access chain" logic instead of the leaf case * Add some assertions to guard the assumption that any access chain we apply has been scheduled for removal * Correctly emit an element extract instead of getting an element address when promoting an element access into an array being promoted * Eliminate a wrapper routine that was setting up an `IRBuilder` and use the one from the block being processed in the SSA pass (since it was set up for stuff just like this) * `ir-validate.cpp` * Add a hack to avoid validation failures when running IR validation on the stdlib code. This case triggers for an initializer (`__init`) declaration inside an interface, since the logical "return type" is the interface type itself, which has no representation at the IR level and thus yields a null result type in a `FuncType` instruction.	23 April 2018, 17:37:56 UTC
627de1c	Tim Foley	23 April 2018, 17:37:24 UTC	Fix successor computation for `switch` instruction (#520) Fixes #519 The code was leaving out the `default` label from the successor list, which would break any passes that require an accurate CFG (with the big one right now being the SSA-formation pass).	23 April 2018, 17:37:24 UTC
163d306	Tim Foley	21 April 2018, 00:54:39 UTC	Better diagnostics when compilation is aborted (#517) * Improve messages when compilation is aborted. Make sure to include the information from any `Slang::Exception` that was thrown, so that the poor user can at least point us at our own message string from an assertion failure. This doesn't provide them line-number information in their code or the Slang codebase, so there is still work to be done in making the compiler more friendly about this stuff. * When aborting compilation, try to note what source location we were working on This is handled by having exception handlers on the stack at key bottleneck points in semantic checking and IR generation, which can then emit a diagnostic to note what we were working on when things failed. This is not intended to be an indiciation to the user that their code is at fault for a compiler crash (it is always our fault), but might give them a chance to work around whatever bug is blocking them.	21 April 2018, 00:54:39 UTC
2f782d4	Tim Foley	20 April 2018, 23:56:33 UTC	Diagnose use of an implicit cast as an argument for an `out` parameter (#516) Work on #499 Two big fixes here: * The logic for checking constraints on `out` arguments wasn't actually triggering because it relied on function parameters being given an `OutType` if they are marked `out`, but the code wasn't actually doing that. Fixing the computation of types for functions resolved that issue. * Next, I added a specific diagnostic to follow up the "expected an l-value" error to let the user know that their argument was implicitly converted, and that is why it doesn't count as an l-value in Slang's rules. I've added a test case to ensure that we retain this diagnostic until we can do a true fix for the issue. The right long-term fix is to have an AST representation of all the implicit casts involved (e.g., in both directions for an `inout` parameter), and then have the IR generate explicit code for the conversions in each direction (the `LoweredVal` representation can handle this sort of thing).	20 April 2018, 23:56:33 UTC
c73ccbc	jsmall-nvidia	20 April 2018, 18:59:17 UTC	Fixes/improvements based on feature/render-binding-resource (#511) * Dx12 rendering works in test framework. * Turn on dx12 render tests. * Split out functions for construction or Renderer types into ShaderRendererUtil. Removed the serialization of buffers code into test-render * Improvements in documentation and typename in BindingState types. RegisterSet -> CompactBindIndexSlice RegisterList -> BindIndexSlice RegisterDesc -> ShaderBindSet * Fix debug build break.	20 April 2018, 18:59:17 UTC
4c751df	jsmall-nvidia	19 April 2018, 21:47:04 UTC	Separation of Binding/Resource construction on Renderer interface (#508) * Dx12 rendering works in test framework. * Turn on dx12 render tests. * First pass at Resource and TextureResource/BufferResource types. * Fix bug in Dx11 impl for BufferResource. * Dx12 supports TextureResource and binds using TextureResource type, and all tests pass. * Added TextureBuffer::Size type to make handling mips a little simpler. * Small improvements to Dx12 constant buffer binding Removed k prefix on an enum * First pass impl of dx11 createTextureResource Added setDefaults to TextureResource::Desc and BufferResource::Desc to simplify setup accessFlags -> cpuAccessFlags desc -> srcDesc * Split out generateTextureResource - can produce the texture using createTextureResource on the Renderer. * Added support for read mapping to Dx11 accessFlags -> cpuAccessFlags First pass at using TextureResource/BufferResource on Dx11 Some tests fail with this checkin * TextureResource working on all tests on dx11. * Construct ResourceBuffers on Dx11 and Dx12 using utility function createInputBufferResource. * First pass at OpenGl TextureResource * Small fixes to dx12 and dx11 setup. Gl working working using BufferResource and TextureResource * Tidy up around the compareSampler - looks like the previous test was incorrect. * Small documentation /naming improvements. * Fix some more small documentation issues. * First pass testing out construction of binding resources external to Renderer implementation. * Moved some BindingState::Desc types to BindingState to make easier to use. * First pass of binding using BindingState::Desc for Dx11. * First pass at binding with dx12. * Fixed issues around separating dx12 binding from ShaderInputLayout * First pass at OpenGl state binding. * BindingState::Desc::Binding::Type -> BindingType * Use Buffer to manage life of vk resources. Construction of buffers handled by createBufferResource (BindingState doesn't have specialized logic) * Remove InputLayout types from binding so can create a binding independent of it. * Added upload buffer to BufferResource - could be used for write mapping. * m_samplers -> m_samplerDescs. First pass at Vk binding with BindingState::Desc. Small tidy/doc improvements. * First pass with binding all taking place through BindingState::Desc. All tests pass. * Removed support for creating BindingState from ShaderInputLayout * Remove serializeOutput from Renderer interface and all implementations. Implement map/unmap on vulkan Implement serializeBindingOutput which uses map/unmap and BindingState::Desc to write result. * Make implementation of BindingState use the BindingState::Desc for much of state - only hold api specific in BindingDetail per implementation. * Use Glsl binding on vulkan (was using hlsl). * BindingState::Desc::Binding -> BindingState::Binding. Made possible by impls using 'BindingDetail' for their specific needs. * Fix compile problems on win32. * Fix a typo in name createBindingSetDesc -> createBindingStateDesc	19 April 2018, 21:47:04 UTC
cbedf01	Tim Foley	19 April 2018, 19:22:20 UTC	Fix GS cross-compilation after IR type system change (#507) The cross-compilation logic for geometry shaders would look through the user's entry point for calls like `someStream.Emit<X>(val)` and turn that into `outputGlobals = val; EmitVertex();`. It was recognizing the `Emit()` calls by looking at the callee in all `call` instructions and seeing if it was registered to lower to GLSL as `EmitVertex()`. The logic was try to look "through" `specialize` instructions (to deal with the `<X>` bit in the call above), but this wasn't updated for the new IR encoding where the first operand to a `specialize` is the generic being specialized, and not the function nested inside it. The fix here is to properly look through both `specialize` instructions and generics. This is kind of a gross operation and we've done things like it in a few places, so it might be something we try to extract into a utility function in the future.	19 April 2018, 19:22:20 UTC
163bf58	Tim Foley	19 April 2018, 16:45:40 UTC	Add type legalization support for "field extract" op (#501) The code was handling the "get field address" opcode (which takes a pointer to a struct and returns a pointer to a field), but didn't have a case for values. This was just an oversight.	19 April 2018, 16:45:40 UTC
c68c6fa	Tim Foley	19 April 2018, 16:34:54 UTC	Fix up DXR type emission from IR type system (#498) * There was a simple typo where we were emitting `RaytracingAccelerationStructureType` instead of `RaytracingAccelerationStructure` * The IR lowering logic was failing to handle types with an `__intrinsic_type` modifier (which maps them to a single IR opcode) that weren't in one of the various special cases. I added a catch-all case to the handling of `DeclRefType`. This notably affected the `RayDesc` type. * Even if we lower `RayDesc` to an intrinsic type, we still need to lower its fields too, and these were getting emitted with mangled names (as would happen for any user-defined fields). The solution I implemented was to allow for fields to have `__target_intrinsic` modifiers in the stdlib, to specify the un-mangled name they should use on each target. I'm not 100% happy with this solution, because it seems odd to have `RayDesc` be an intrinsic type, but then to also have field keys used in `getField` instructions as if it were an ordinary `struct`. It seems like a better solution would be to have it lower to an IR `struct`, just with an appropriate modifier.	19 April 2018, 16:34:54 UTC
17fa424	Tim Foley	19 April 2018, 00:22:44 UTC	Fix output of `groupshared` with IR type system (#492) The basic problem was that the lowering logic was constructing (more or less) `Ptr<@GroupShared X>` instead of `@GroupShared Ptr<X>`. There were also problems with passes not propagating through rates that should have been (e.g., legalization). I've added a test case to actually validate `groupshared` support.	19 April 2018, 00:22:44 UTC
c3a27c0	Tim Foley	18 April 2018, 21:14:26 UTC	Fix up name mangling/unmangling for extensions (#493) * Fix up name mangling/unmangling for extensions This is required for the unmangling we do on some builtin function names. The work here is mostly just a band-aid, and a more comprehensive pass over the name mangling/unmangling code is required to make any of this robust. * fixup: UNREACHABLE_RETURN argument	18 April 2018, 21:14:26 UTC
0450ca6	Tim Foley	18 April 2018, 19:57:34 UTC	Fix some logic around legalization of sampler types (#496) The main error here was checking for `IRSamplerType` instead of `IRSamplerTypeBase`, which means the relevant logic only triggered for the `SamplerState` type and not the `SamplerComparisonState` type. The two affected places were type legalization (so that comparison samplers in `struct` types weren't being hoisted out) and the emit logic when deciding whether to introduce local temporaries (so we were emitting temporaries for comparison samplers, leading to GLSL errors).	18 April 2018, 19:57:34 UTC
00389a1	jsmall-nvidia	17 April 2018, 20:59:03 UTC	Feature/renderer binding (#489) * Dx12 rendering works in test framework. * Turn on dx12 render tests. * First pass at Resource and TextureResource/BufferResource types. * Fix bug in Dx11 impl for BufferResource. * Dx12 supports TextureResource and binds using TextureResource type, and all tests pass. * Added TextureBuffer::Size type to make handling mips a little simpler. * Small improvements to Dx12 constant buffer binding Removed k prefix on an enum * First pass impl of dx11 createTextureResource Added setDefaults to TextureResource::Desc and BufferResource::Desc to simplify setup accessFlags -> cpuAccessFlags desc -> srcDesc * Split out generateTextureResource - can produce the texture using createTextureResource on the Renderer. * Added support for read mapping to Dx11 accessFlags -> cpuAccessFlags First pass at using TextureResource/BufferResource on Dx11 Some tests fail with this checkin * TextureResource working on all tests on dx11. * Construct ResourceBuffers on Dx11 and Dx12 using utility function createInputBufferResource. * First pass at OpenGl TextureResource * Small fixes to dx12 and dx11 setup. Gl working working using BufferResource and TextureResource * Tidy up around the compareSampler - looks like the previous test was incorrect. * Small documentation /naming improvements. * Fix some more small documentation issues.	17 April 2018, 20:59:03 UTC
15bff91	Tim Foley	13 April 2018, 20:36:26 UTC	Propagate diagnostics when imported module has errors (#485) A previous fix avoided crashes when an `import`ed module has errors by making the "failed to import" error a fatal one. Unfortunately, the code path that handles fatal errors was failing to copy diagnostic output from the sink over to the member variable on the `CompileRequest` that exposes the output through the API. This meant that API users lost all context on error messages in `import`ed code. This change fixes the immediate issue by plumbing through the error output, but doesn't fix the more fundamental issue: the front-end should not crash when an `import` fails, by any means.	13 April 2018, 20:36:26 UTC
021a492	Tim Foley	13 April 2018, 00:08:52 UTC	Preprocessor cleanups (#484) * For a `#error` or `#warning`, read the rest of the line as raw text to include in the error message * When skipping tokens (e.g., in an `#ifdef`d out block), don't emit errors on invalid characters * TODO: we could clearly get more efficient and skip whole raw lines in the future * Fix an issue when a macro invocation that expands to nothing (zero tokens) is the last thing before a directive. The preprocessor was returning the `#` as an ordinary token, because it has already gone past its test for directives.	13 April 2018, 00:08:52 UTC
baf194e	Tim Foley	11 April 2018, 23:18:29 UTC	Introduce an IR-level type system (#481) * Introduce an IR-level type system Up to this point, the Slang IR has used the front-end type system to represent types in the IR. As a result (but ultimately more importantly) the IR representation of generics and specialization has used AST-level concepts embedded in the IR. For example, to express the specialization of `vector<T,N>` to a concrete type `float` for `T`, we needed an IR operation that could represent the specialization, with operands that somehow represented the type argument `float`. The whole thing was very complicated. The big idea of this change is to introduce a new representation in which types in the IR are just ordinary instructions, so that using them as operands makes sense. The hierarchy of IR types closely mirrors the AST-side hierarchy for now, and that will probably be something we should maintain going forward. In order to make these changes work, though, I also had to do major overhauls of things like the way substitutions are performed, how we check interface conformances, the way lookup through interface types is done, etc. etc. This is a big change, and unfortunately any attempt to summarize it in the commit message wouldn't do it justice. * Fix 64-bit build warning * Fix up some clang warnings/errors	11 April 2018, 23:18:29 UTC
6322983	jsmall-nvidia	11 April 2018, 19:55:44 UTC	Feature/dx12 compute (#483) * Dx12 rendering works in test framework. * Turn on dx12 render tests. * Getting simpler dx12 compute tests to work. * With expected data in test - check for specialized and then for the default, so that multiple test can share the same expected data, but specialized cases can still be set. * Fixed construction and binding on dx12 textures. * Control which render apis used in test from command line. * Small aesthetic fixes in render-test/main.cpp. * Fix binding problem for uavs/srvs dx12. Previously tried to create srv/uav for StorageBuffers (like dx11 does), but the binding breaks as you can end up with two srvs using the same register. First pass at fixing problems with Texture creation for dx12 - assertions were hit with 3d or array textures. * Fixes to improve Dx12 setup shader resource views for cubemaps/arrays. * Fixed d3d12 textureSamplingTest - problem was that cubemap/array textures were not being uploaded correctly. * Changed the order of how binding of constant buffers (as just set on the Renderer) indexes. Previously they were given the lowest indices, but they clashed with the indices from the 'Binding'. Changing this means all tests run on d3d12. * Add code to allow use of warp (although not command line switchable yet). Fix problem setting up raw UAV - as identified by warp. * Added RenderApiUtil - which can detect if a render api is potentially available. * Moved render flag testing/parsing into RenderApiUtil. * Fix signed/unsigned warning. * Fixes around enums prefixed with k on the review of feature/dx12 compute branch. * Remove explicit -dx12 line in tests, as all can currently be generated from dx11 tests.	11 April 2018, 19:55:44 UTC
c4004b3	jsmall-nvidia	10 April 2018, 21:53:03 UTC	Feature/dx12 compute (#482) * Dx12 rendering works in test framework. * Turn on dx12 render tests. * Getting simpler dx12 compute tests to work. * With expected data in test - check for specialized and then for the default, so that multiple test can share the same expected data, but specialized cases can still be set. * Fixed construction and binding on dx12 textures. * Control which render apis used in test from command line. * Small aesthetic fixes in render-test/main.cpp. * Fix binding problem for uavs/srvs dx12. Previously tried to create srv/uav for StorageBuffers (like dx11 does), but the binding breaks as you can end up with two srvs using the same register. First pass at fixing problems with Texture creation for dx12 - assertions were hit with 3d or array textures. * Fixes to improve Dx12 setup shader resource views for cubemaps/arrays. * Fixed d3d12 textureSamplingTest - problem was that cubemap/array textures were not being uploaded correctly. * Changed the order of how binding of constant buffers (as just set on the Renderer) indexes. Previously they were given the lowest indices, but they clashed with the indices from the 'Binding'. Changing this means all tests run on d3d12. * Add code to allow use of warp (although not command line switchable yet). Fix problem setting up raw UAV - as identified by warp. * Added RenderApiUtil - which can detect if a render api is potentially available. * Moved render flag testing/parsing into RenderApiUtil. * Fix signed/unsigned warning. * Fixes around enums prefixed with k on the review of feature/dx12 compute branch.	10 April 2018, 21:53:03 UTC
5298ccf	Tim Foley	05 April 2018, 23:15:45 UTC	Falcor fixes (#479) * Don't emit interpolation modifiers on GLSL fields The previous change that started passing through interpolation modifiers didn't account for the fact that the GLSL grammar doesn't allow interpolation modifiers on `struct` fields, so we shouldn't emit them in that case (and our legalization strategy for GLSL guarantees that varying input will never use a `struct` type anyway). * Try to handle SV_Position semantic on GS input HLSL allows `SV` semantics to be used for ordinary inter-stage dataflow in some cases (e.g., a VS can output `SV_Position` and it can then be read from a GS). GLSL allows similar things with `gl_Position`, but there are some wrinkles. One fix here is to correctly identify that `gl_FragCoord` should only be used as the translation of `SV_Position` for a fragment shader input (and not in the general case of any input). The other "fix" here is a kludge to handle the fact that the right translation for a GS input is not to an array called `gl_Position`, but to the syntax `gl_in[index].gl_Position` (array-of-structs style). I am doing this by attaching a custom decoration to the global variable we create for `gl_Position` and then intercepting it during code emit. I'm not proud of this, and would like to do something better given time. * Fix GLSL output for matrix-scalar multiplication The output logic was assuming that any use of `operator` in the input code that yields a value of matrix type must be translated to a `matrixCompMult()` call in GLSL, but this should really only apply if both of the operands* are matrices (not just based on the result type). As a result matrix-times-scalar operations were emitting a call to `matrixCompMult()` and GLSL was complaining because it won't implicitly promote scalars to matrices.	05 April 2018, 23:15:45 UTC
071566c	jsmall-nvidia	04 April 2018, 21:23:49 UTC	Switch on dx12 testing for remaining render tests. (#477) * Dx12 rendering works in test framework. * Turn on dx12 render tests.	04 April 2018, 21:23:49 UTC
357bce2	jsmall-nvidia	04 April 2018, 17:39:42 UTC	Dx12 rendering works in test framework. (#476)	04 April 2018, 17:39:42 UTC
bb6045d	Tim Foley	04 April 2018, 15:14:57 UTC	Add an EditorConfig file (#474) Developers may need to install an appropriate plugin for their text editor. For example, Visual Studio 2015 users can install the plugin from `Tools -> Extensions and Updates...`. Visual Studio 2017 supports EditorConfig directly. The main goal of this PR is to allow VS users to more easily toggle between projects that default to spaces or tabs for indentation without having to change their VS options. One side effect, however, is that EditorConfig might get overzealous and replace tabs with spaces in any file that is edited, so that PRs may contain spurious changes for a bit. The best fix for this would be to check in some project-wide fixup changes for formatting.	04 April 2018, 15:14:57 UTC
7a8ed51	Tim Foley	04 April 2018, 14:18:33 UTC	Pass AST interpolation modifiers through to codegen. (#475) This is a short-term fix, because we (1) don't have an IR-level representation of interpolation qualifiers, and (2) can't introduce one until after the IR-level type system is introduced (to be able to handle `struct` fields). The approach here is to find the AST-level declaration, either from layout information (in the case of an ordinary variable or function parameter), or from struct field information (because structs are being output from the AST form anyway). I've included a single end-to-end rendering test to confirm that we handle the `nointerpolation` modifier the same as HLSL. I also added the `noperspective` modifier, which seemed to be missing from our implementation.	04 April 2018, 14:18:33 UTC
3115ba7	jsmall-nvidia	03 April 2018, 16:25:51 UTC	Fixes based on review of feature/dx12 PR. (#473)	03 April 2018, 16:25:51 UTC
499e258	Tim Foley	03 April 2018, 01:39:15 UTC	Implement "operator comma" in IR codegen (#472) Fixes #471	03 April 2018, 01:39:15 UTC
38d5ef4	jsmall-nvidia	02 April 2018, 23:59:33 UTC	Feature/dx12 (#469) * Fix signed/unsigned comparison warning. * Split out d3d functions that will work across dx11 and 12. * Improve slang-test/README.md around command line options. * Make Guid comparison honor alignment for comparisons, such that mechanism work on architectures that can only do aligned accesses. * Initial setup of D3D12 Renderer, with presentFrame and clearFrame. * More support for D3D12 * Added FreeList * Added D3D12CircularResourceHeap * First attempt at createBuffer * First pass at map/unmap. * First pass binding vertex/constant buffers, and setting up InputLayout. Note that memory is not kept in scope on binding yet. * First pass of D3DDescriptorHeap * Small tidy up in render-d3d11. Added D3DDescriptorHeap to project. * First pass at D3D12 bind state. * Fix typos in D3D12Resource * Tidy up Dx11 render binding a little to match more with Dx12 style. * First pass at Dx12 BindingState * Handling of the command list d3d12. Support for submitGpuWork and waitForGpu. * First attempt at Dx12 capture of backbuffer to file. * First attempt at D3D12 binding for graphics. * D3D12 setup viewport etc - does now render triangle in render0.hlsl. * First pass at support for compute on D3D12Renderer * Use spaces over tabs in D3DUtil * Tabs to spaces in D3D12DescriptorHeap * Convert tab->spaces on render-d3d12.cpp	02 April 2018, 23:59:33 UTC
5a02738	Tim Foley	02 April 2018, 22:08:21 UTC	Update to top-of-tree glslang (2018-04-02). (#470) This is an attempt to alleviate some driver crashes caused by invalid SPIR-V. Because Slang drives `glslang` with GLSL source code, any invalid output is likely due to `glslang` bugs. I chose top-of-tree for `glslang` because it wasn't clear what their release process is. Hopefully we can go another year without having to update this dependency. The build setup we use for `glslang` had to change to account for one more `#define` that the code expects to have passed in externally.	02 April 2018, 22:08:21 UTC
bd66d4f	Tim Foley	30 March 2018, 23:53:07 UTC	Fix several issues discovered by Falcor (#467) Fixes #466 Most of these are Vulkan-related regressions. * Kludge the definition of `GroupMemoryBarrierWithGroupSync()` for GLSL so that it works around parentheses that the emit logic now introduces. * Don't emit `static` for global constants when targetting GLSL * Emit the `flat` modifier for varying input/output with integer type, when targetting GLSL * Avoid checking parameter default-value expressions more than once, because this can crash when the checking introduces syntax that is not expected to appear in the input AST	30 March 2018, 23:53:07 UTC
87c50cf	Tim Foley	29 March 2018, 22:47:58 UTC	Avoid crash when bad argument given to [instance(...)] attribute (#464) Fixes #463 Some of the attributes were failing to check for a `null` result from `checkConstantIntVal`, and so they crashed when a bad expression was used in an attribute. The particular way this had been triggered was that a user put an HLSL geometry shader in the same file with other code, using an entry point like: ```hlsl [instance(COUNT)] void myGeometryShader(...) {...} ``` They then defined `COUNT` as a preprocessor macro when compiling using the GS, but left it undefined otherwise. The result was that the argument to the `instance` attribute would fail to type check, and thus wouldn't count as a constant integer value, so that `checkConstantIntVal` returns `null` and results in the crash. The workaround for the user is to always define `COUNT`, even when not compiling the GS. The fix in the compiler is to guard against `null` in these cases and bail out of attribute checking. I also implemented logic so that `CheckIntegerConstantExpression` (which is invoked by `checkConstantIntVal`) will not produce an additional error message if the underlying expression failed to type check. In this casem the user will get an `undefined identifier: COUNT` error message, and we don't need to waste their time by also telling them that this isn't a compile-time constant expression.	29 March 2018, 22:47:58 UTC
b61371d	Tim Foley	29 March 2018, 20:40:55 UTC	Change uses of "spire" to "slang" (#461) Fixes #350 When the Slang project forked off from the Spire research effort, we renamed things as we went, but many cases seem to have slipped through the cracks. The two biggest diffs here are: - The `hello` example program was incorrectly talking about what was in the shader file (Slang no longer supports the "module" or "pipeline" constructs from Spire), and so it wasn't just a simple rename. - The files under `tests/bindings` were mistakenly using `__SPIRE__` as a preprocessor guard, which means that they weren't actually testing what they meant to. Luckily, it looks like the relevant functionality didn't regress while these tests were unintentionally deactivated.	29 March 2018, 20:40:55 UTC
8c50f9f	Tim Foley	29 March 2018, 18:57:57 UTC	Make IR-based output code cleaner (#458) The big change here is that rather than trying to reproduce the exact line number and indentation of names as they appeared in the original code (which had been appropriate for the AST-to-AST translation strategy), we now emit code from the IR using a simple "pretty-printing" strategy, where indentation is determined by nesting. Along the way I deleted a bunch of dead code in `emit.cpp` that was handling emit from AST declarations/statements/expressions. I probably should have pulled that out into its own change, but doing that now would be tricky. This change also makes it so that we do not emit `#line` directives by default. Asking for `#line` directives in the output will probably become part of a Slang "debug mode" that tries to make the output code suitable for step-through debugging.	29 March 2018, 18:57:57 UTC
b4c4dc9	Tim Foley	29 March 2018, 17:39:21 UTC	Add support for default parameter values in IR codegen (#459) Fixes #61 When lowering from AST to IR, if a call site doesn't supply an argument expression for each of the parameters to the callee, then use the default value expressions (stored as the "initializer" of the parameter decl) for each omitted parameter. This relies on the front-end to have already checked the call site for validity. Along the way I also cleaned up some of the checking of parameter declarations so that it is more like the checking of ordinary variable declarations (although the code is not yet shared). I also cleaned out some dead cases in the lowering logic for when we don't actually have a declaration available for a callee (these would only matter if we supported functions as first-class values). I added a simple test case to confirm that call sites both with and without the optional parameter work as expected. The strategy in this change is extremely simplistic, and might only be appropriate for default parameter value expressions that are compile-time constants (which should be the 99% case). This may require a major overhaul if we decide to handle default parameter values differently (e.g., by generating extra functions to ensure that the separate compilation story is what we want). Another issue that could change a lot of this logic would be if we start to support by-name parameters at call sites, since we could no longer assume that the argument and parameter lists align one-to-one (with the argument list possibly being shorter). Any work to add more flexible argument passing conventions would need to build a suitable structure to map from arguments to parameters, or vice-versa.	29 March 2018, 17:39:21 UTC
184dc5c	Tim Foley	28 March 2018, 15:17:48 UTC	Merge from v0.9.15 (#460) * Fix bug when subscripting a type that must be split (#396) The logic was creating a `PairPseudoExpr` as part of a subscript (`operator[]`) operation, but neglecting to fill in its `pairInfo` field, which led to a null-pointer crash further along. * Allow writes to UAV textures (#416) Work on #415 This issue is already fixed in the `v0.10.` line, but I'm back-porting the fix to `v0.9.`. The issue here was that the stdlib declarations for texture types were only including the `get` accessor for subscript operations, even if the texture was write-able. I've also included the fixes for other subscript accessors in the stdlib (notably that `OutputPatch<T>` is readable, but not writable, despite what the name seems to imply). * Fix infinite loop in semantic parsing (#424) The code for parsing semantics was looking for a fixed set of tokens to terminate a semantic list, rather than assuming that whenever you don't see a `:` ahead, you probably are done with semantics. This meant that you could get into an infinite loop just with simple mistakes like leaving out a `;`. This change fixes the parser to note infinite loop in this case, and adds a test case to verify the fix. * Expose HLSL `shared` modifier through reflection. (#436) This is a request from Falcor, because the `shared` modifier can be used as a hint to optimize the grouping of parameters for binding. The intention is that `shared` marks shader parameters (including parameter blocks) that will us the same values across many draw calls (e.g., per-frame data, as opposed to per-model or per-instance). The mechanism I'm using here is to provide a general reflection API for exposing the `Modifier`s already attached to declarations. While the only modifier exposed is `shared`, and the only modifier information being exposed is presence/absence, this interface could be extended down the line.	28 March 2018, 15:17:48 UTC
17b66ef	Tim Foley	27 March 2018, 00:44:50 UTC	Unify all generic parameters, even if some mismatch (#454) * Fix decl-ref printing to handling NULL pointers If the underlying decl, or its name is NULL, then use an empty string for the declaration name. This issue was found when debugging, but could bite non-debug cases too, if we ever try to print something like a generic type constraint, which has no name. * Unify all generic parameters, even if some mismatch Fixes #449 The front end tries to infer the right generic arguments to use at a call site using a sloppily implemented "unification" approach. The basic idea is that if you pass a `vector<float,3>` into a function that operates ona `vector<T,N>` where `T` and `N` are generic paameters, then the unification will try to unify `vector<float,3>` with `vector<T,N>` which will lead to it recursively unifying `float` with `T` and `3` with `N`, at which point we have viable values to substitute in for those parameters. Where the existing approach is maybe not quite right is in how it handles obvious unification failures. So if we ask the code to unify, say, `float` with `uint`, it will bail out immediately because those can't be unified. This sounds right superficially, but in some cases with might be calling a function that takes a `vector<float,N>` and passing a `vector<uint,3>` and we'd like to at least get far enough along with unification to see that `N` should be `3` so that the front end can maybe decide to call the function anyway, with some amount of implicit conversion. Over time I've had to modify a lot of the "unification" logic so that it doesn't treat the obvious failures as a hard stop, and instead just returns the failure as a boolean status, but keeps on trying to unify things even after such a failure. When doing unification as part of inference for generic arguments, there will usually be subsequent steps (e.g., type conversions for function aguments) that will catch the type errors that arise. This specific change is to make is so that when unifying the substitutions for a generic decl-ref, we try to unify all the pair-wise arguments, and don't bail out on the first mismatch (so that the `float`-vs-`uint` failure above doesn't lead to us skipping the `3` and `N` pairing). The one case we need to watch out for in all of this is when unification is used to check if an `extension` declaration (which might be generic) is actually application to a concrete type. In that case we obviously don't want an extension for `vector<float,N>` to apply to `vector<uint,3>`, so it is important that the extension case check the return status from the unification logic (or in the future, it could just confirm that the substituted type is equivalent to the original as a post-process...). I've added a test case that reproduces the original failure that surfaced the bug. * fixup: add expected test output	27 March 2018, 00:44:50 UTC
6d400b6	jsmall-nvidia	26 March 2018, 20:10:43 UTC	Fix signed/unsigned comparison warning. (#455)	26 March 2018, 20:10:43 UTC
74bf38b	jsmall-nvidia	26 March 2018, 19:34:01 UTC	Renderer resource mangement for render-test (#453) * First pass at resource based renderer using RefObject. * Correct handling of array of buffer pointers to Dx11. * Fix bug with setting viewOut incorrectly in createInputTexture. * More support for allowing com like interfaces. * Added and tidied Slang::Result - adding interface specific results * Guid added comparison support, and made base interface IComUnknown - with lowerCamel methods	26 March 2018, 19:34:01 UTC
5000d27	jsmall-nvidia	22 March 2018, 22:08:19 UTC	Tidy up of Renderer (#452) * Fixed some small typos in api-users-guide.md * Fix some small typos in slang-test/main.cpp, render-test/render-d3d11.cpp * Remove exit() calls from test code. Added Slang::Result, which works in the same way as COM HRESULT. * FIx bug introduced when moving to Slang::Result - handling E_INVALIDARG on Dx11. * Fix the testing of feature levels on Dx11 renderer. * First attempt at README.md for slang-test. * Tidied up the slang-test README.md file. * Fix some small typos in tools/slang-test/main.cpp * Fix spaces -> tabs problems. Fix some small types. * Refactor Renderer implementations such that: * Class definition does not contain long implementation/s * Removed unused globals * Ordered implementation after class definition * Made renderer specific classes child classes, and use Impl postfix to differentiate * Converted tabs into spaces * First pass at Slang::ComPtr. Added slang-defines.h which sets up some fairly commonly used defines such as SLANG_FORCE_INLINE, compiler detection, os detection, and some other cross platform features. * * Fixed bug in vk renderer - where features structure not initialized on hkCreateDevice * Make member variables in Renderer implementations use prefix * Updated test README.md to document that free parameter can control what test is run * * changed setClearColor to take an array of 4 floats to make API clearer on usage * mix of type usage style - defaulted to more conventional style * * Fixed swapWith * Use SLANG_FORCE_INLINE * Don't bother initializing List data when type is POD * Added convenience macro for Result handling SLANG_RETURN_NULL_ON_ERROR	22 March 2018, 22:08:19 UTC
5e720e7	Tim Foley	22 March 2018, 18:21:45 UTC	Add support for DirectX Raytracing (DXR) (#451) * Add support for DirectX Raytracing (DXR) This is an initial pass to add support to Slang for the shader stages introduced by DirectX Raytracing (DXR). * Add declarations for DXR intrinsic types and functions to the Slang standard library. The way our compilation works, these will then get propagated through the IR as intrinsics and get spit back out again as-is during HLSL code emission. * Declare the DXR-related stages. This is the main work that affects the compiler's C++ implementation rather than being something we can add via the standard library today. * Switch around the encoding of the `Profile` type so that the stage is in the low bits, allowing API users to pass an ordinary `SlangStage` to operations that expect a `SlangProfileID`. - This represents a direction I'd like to push in long term, where the user specifies stage and "feature level" separately rather than using composite profiles like `vs_6_0`. The introduction of these new stages seems like a good point to try and make a clean break here and not introduce, e.g., `rgs_6_1` for ray generatin shaders. * Upgrade "effective profile" computation so that it advances the required version based on the specified stage (e.g., DXR stages seem to require at least shader model 6.1). - This is a bit of a kludge overall, but ideally we don't want a typical user to have to think about "feature level" stuff much at all. The ideal workflow is that they just hand us a source file and we work out entry points and their required feature levels in the compiler (and let the user query it when we are done). Until we implement that for real, stopgaps like this are required. Overall these are relatively small changes for supporting some major new API behavior. Slang's design helps out here, by allowing a lot of things to be specified in the stdlib (including generic intrinsic functions), but some of this is also owed to the DXIL-influenced design of DXR - e.g., the use of global functions in place of `SV_` semantics. fixup: typos * Fixup: use `pixel` instead of `fragment` as primary stage name This is to match HLSL conventions when generating output code, even if the Slang project officially favors the more correct term "fragment shader."	22 March 2018, 18:21:45 UTC
d421988	jsmall-nvidia	21 March 2018, 18:28:43 UTC	First pass impls on ComPtr and reorganise Renderer (#450) * Fixed some small typos in api-users-guide.md * Fix some small typos in slang-test/main.cpp, render-test/render-d3d11.cpp * Remove exit() calls from test code. Added Slang::Result, which works in the same way as COM HRESULT. * FIx bug introduced when moving to Slang::Result - handling E_INVALIDARG on Dx11. * Fix the testing of feature levels on Dx11 renderer. * First attempt at README.md for slang-test. * Tidied up the slang-test README.md file. * Fix some small typos in tools/slang-test/main.cpp * Fix spaces -> tabs problems. Fix some small types. * Refactor Renderer implementations such that: * Class definition does not contain long implementation/s * Removed unused globals * Ordered implementation after class definition * Made renderer specific classes child classes, and use Impl postfix to differentiate * Converted tabs into spaces * First pass at Slang::ComPtr. Added slang-defines.h which sets up some fairly commonly used defines such as SLANG_FORCE_INLINE, compiler detection, os detection, and some other cross platform features.	21 March 2018, 18:28:43 UTC
98b8e0c	jsmall-nvidia	20 March 2018, 21:14:12 UTC	SlangResult and small bug/typos fixes (#448) * Fixed some small typos in api-users-guide.md * Fix some small typos in slang-test/main.cpp, render-test/render-d3d11.cpp * Remove exit() calls from test code. Added Slang::Result, which works in the same way as COM HRESULT. * FIx bug introduced when moving to Slang::Result - handling E_INVALIDARG on Dx11. * Fix the testing of feature levels on Dx11 renderer. * First attempt at README.md for slang-test. * Tidied up the slang-test README.md file. * Fix some small typos in tools/slang-test/main.cpp * Fix spaces -> tabs problems. Fix some small types.	20 March 2018, 21:14:12 UTC
7256214	Tim Foley	19 March 2018, 19:45:23 UTC	Entry point attribute (#447) * Typo * Add [shader(...)] and clean up some literal handling * Add supporting for validating the `[shader(...)]` attribute, by checking that its argument is a string literal that names a known shader stage. * Split the `ConstantExpr` class into distinct subclasses rooted at `LiteralExpr`, so we have `BoolLiteralExpr`, `IntegerLiteralExpr`, `FloatingPointLiteralExpr`, and `StringLiteralExpr` * Add a `String` type to the stdlib, to be used as the type of a string literal. This change allows code using `[shader(...)]` to be accepted by the front-end again, but it does nothing about emitting it in final HLSL. * Allow entry points to be specified via [shader(...)] Before this change, the compiler would track a list of `EntryPointRequest` objects, based on what the suer specified via API and/or command-line options. Each entry point request would get matched up with an AST `FuncDecl` as part of semantic checking, and then the back end steps (layout, codegen, etc.) would work from that information. This change makes the compiler modal, in that it can either continue to use an explicit list of entry point requests (this is the mode when the list is non-empty), or it can rely on user-supplied attributes on entry point functions to drive codegen (this is the mode when the list is empty). User-specified `[shader(...)]` attributes are processed at the same place where the association from `EntryPointRequest`s to `FuncDecl`s would otherwise be made, and basically does the same thing in the opposite direction: looks for `FuncDecl`s with the appropriate attribute and synthesizes an `EntryPointRequest` for them. Subsequent processing should ideally not know where a given `EntryPointRequest` came from, and should handle both methods of specifying the entry points equivalently. One design choice that might not make immediate sense is that we do not process a function as an entry point (applying further validation, etc.) just because it has a `[shader(...)]` modifier, unless we are in the appropriate mode (which in this case is the mode where the user didn't specify their own entry points via API or command line). This is to handle cases where the user wants to explicitly compile only one entry point, so that they (1) don't want us to spend time validating code they don't care about, (2) don't want do get output they don't expect, and (3) might actually be presenting us with code that violates the language rules due to a combination of `#define`s in effect (e.g., they might have a `[shader("vertex")]` function that transitively executes a `discard` because of how the preprocessor was configured, but they don't care because they are compiling a fragment entry point). This decision might be something we revisit over time. As part of this work, I had to add some logic to pick a "profile version" to use for a combination of a target and stage (because when you specify `[shader("vertex")]` the compiler can't tell if you want `vs_5_0`, `vs_5_1`, etc.). This isn't really complete right now, because something like `-target dxbc` also doesn't determine a profile, so there is a bit of a kludge at present. We need to figure out a good long-term plan here, which might involve keeping target format, feature level/version, and pipeline stage as truly orthogonal concepts, rather than conflating them. That would involve more work in the API and command-line layers to de-compose things when the user specifies, e.g., `vs_5_1`, but might make downstream logic easier to manage. * Emit [shader(...)] attribute on entry point for SM 6.1 and later This should help ensure that the output from Slang can be compiled with dxc `lib_` profiles. Fix warning	19 March 2018, 19:45:23 UTC
04c9476	jsmall-nvidia	16 March 2018, 19:02:55 UTC	Fixed some small typos in api-users-guide.md (#446)	16 March 2018, 19:02:55 UTC
a549362	Yong He	16 March 2018, 18:28:04 UTC	Small bug fixes. (#445) This commit contains two small bug fixes: 1. In `specializeProgramLayout`, we cannot assume the resourceInfo entries in a varLayout and its corresponding type layout has the same order. Should use `FindResourceLayout`. 2. When generating ir for a switch statement, make sure to remove the breakLabel from the shared context when done. For some reason if a switch statement is being lowered twice, the Dictionary::Add method will complain the statement key already exists.	16 March 2018, 18:28:04 UTC
4c23ba2	Tim Foley	16 March 2018, 16:06:01 UTC	Typos (#444)	16 March 2018, 16:06:01 UTC
93ac152	Tim Foley	16 March 2018, 14:01:35 UTC	Overhaul implementation of [attributes] (#443) The existing code parsed all of the square-bracket `[attributes]` into `HLSLUncheckedAttribute`, and then went on to hand-convert some of them to specialized subclasses of `HLSLAttribute`. When attributes didn't check, they were left as-is, and no error message was issued, because at the time the compiler was focused on accepting arbitrary input. This change greatly overhauls the handling of `[attributes]`. Attributes are now declared in the stdlib, with declarations like: ```hlsl __attributeTarget(LoopStmt) attribute_syntax [unroll(count: int = 0)] : UnrollAttribute; ``` In this syntax, the `unroll` part is giving the attribute name (the `[]` are just for flavor, to make the declaration look like a use site; we could drop it if we don't like the clutter), the `count` is a parameter of the attribute, which we expect to be of type `int`, and which has a default value of `0` if unspecified. The `: UnrollAttribute` part specifies the meta-level C++ class that will implement this attribute (and corresponds to a class in `modifier-defs.h`). This syntax is similar to our current `syntax` declarations. I'm starting to think we should change it to something like a `__meta_class(UnrollAttribute)` modifier, and then use that uniformly across all cases (e.g., also replacing the curreent `__magic_type(Foo)` syntax). The `__attributeTarget(LoopStmt)` is a modifier that specifies the meta-level C++ class for syntax that this attribute is allowed to attach to. It is legal to have more than one of these. Attributes continue to be parsed in an unchecked form, so that we don't tie up semantic analysis and parsing more than necessary. During checking, we look up the attribute name in the current scope, and then replace the unchecked attribute with a more specific one if the checking passes. Checking proceeds in generic and attribute-specific phases. The generic phase includes checking the number of arguments against those specified in the attribute declaration (I don't currently check types, or handle default arguments), and then checking that at least one `__attributeTarget(...)` modifier applies to the syntax node being modified. The attribute-specific phase then applies to the specialized C++ subclass of `Attribute`, and does the actual checking right now (e.g., that step is responsible for actually type-checking things at present). This can obviously be improved over time. With this support I went ahead and added declarations for all the HLSL attributes I could find documented on MSDN. I also added a provisional declaration for the `[shader(...)]` attribute that has been added to dxc, but which is not yet documented. One important detail here is that lookup of attribute names needs to be done carefully, so that we don't let, e.g., local variables shadow an attribute declaration: ```hlsl int unroll = 5; // This attribute should not get confused by the local variable `unroll` [unroll] for(...) { .. } ``` The lookup logic already has a notion of a `LookupMask` that can be used to filter declarations out of the result. In this change I surfaced that mask through the main lookup API (rather than requiring a second pass to "refine" lookup results), and made is so that the default lookup mask does not include attributes, while an explicit mask can be used to look up only attributes. (An alternatie design we discussed was to follow the approach of C# and have the declaration of an attribute like `[unroll]` actually be `unrollAttribute`, with a suffix. I decided not to follow that approach for now because it seemed like printing good error messages in that case could require us to carefully trim the `Attribute` suffix off of names at times, and using the existing mask behavior seemed simpler.) To verify that the shadowing behavior is indeed correct, I modified the `loop-unroll.slang` test case. Smaller notes: * Removed the `HLSL` prefix from several of the C++ attribute classes * Made sure to actually validate the modifiers on statements * Special-cased checking for `ParamDecl` with a null type, because I'm re-using `ParamDecl` for attribute parameters, but can't give a concrete type to some of them right now * Deleting some old, dead emit-from-AST logic around attributes, rather than try to "fix" code that doesn't run (a more complete scrub of that code is still needed) * Fixed AST inheritance hierarchy so that a `Modifier` is a `SyntaxNode` rather than a `SyntaxNodeBase`. I have no idea why we have both of those, and we need to clean that up soon.	16 March 2018, 14:01:35 UTC
e69c0bd	Tim Foley	14 March 2018, 19:52:59 UTC	When emitting from IR, skip structured types with `__builtin` modifier (#442) This allows users who call `spAddBuiltins` to specify `struct` types that are provided for a target, and that should not be part of the generated HLSL/GLSL from Slang. The existing emit logic already skips emitting any basic, vector, matrix, or resource types, so this case really only triggers when the standard library needs to declare a completely ordinary `struct`. No provision is made right now for a `struct` type that might be builtin on one target, but need to be declared on another. That is a clear next step for this feature.	14 March 2018, 19:52:59 UTC
972dda6	Yong He	12 March 2018, 19:07:40 UTC	Merge pull request #441 from tfoleyNV/legalization-fix Don't use specialized field names when legalizing types	12 March 2018, 19:07:40 UTC
3f2b123	Yong He	12 March 2018, 19:07:29 UTC	Merge branch 'master' into legalization-fix	12 March 2018, 19:07:29 UTC
e08f0c6	Yong He	12 March 2018, 18:02:28 UTC	Stop compilation when a imported module contains errors. (#440) * Stop compilation when a important module contains errors. * Fixup test cases	12 March 2018, 18:02:28 UTC
e8f44d4	Tim Foley	12 March 2018, 17:26:17 UTC	Don't use specialized field names when legalizing types The type legalization logic currently looks up layout information for the fields of an aggregate type using the mangled name of the field (since this should be consistent even when legalization changes the number of fields). The problem at present is that type layouts for specialized generic types were storing the mangled names of specialized field decl-refs, while the lookup was using the mangled names of the unspecialized declarations. The latter (unspecialized names) is what we want for simplicity, so this change fixes the comparison.	12 March 2018, 17:30:33 UTC
51bc468	Yong He	12 March 2018, 16:40:16 UTC	Prevent duplicating global values during specialization (#439) * Prevent duplicating global values during specialization. * Fixup checking code	12 March 2018, 16:40:16 UTC
a22b752	Tim Foley	08 March 2018, 20:35:00 UTC	Cleanups on slang-generate (#437) * Cleanups on slang-generate There is nothing too significant in these changes, but I'm trying to get things in place so that we can: - Clean up the stdlib code to do less explicit `StringBuilder` operations and instead to use more of the "template engine" approach - Start using slang-generate for code other than the slang stdlib, so that we can generate more of our boilerplate. The main new functionality here is that in a template/meta file, you can now enclose an expression in `$(...)` to indicate that is should be spliced into the result. E.g. instead of: class ${{ sb << someClassName; }} { ... } We can now write: class $(someClassName) { ... } The other bit of new functionality is support for a whole-line statement escape, so that instead of: ${{ for( auto a : someCollection ) { }} void $(a)() { ... } ${{ } }} We can instead write: $: for(auto a : someCollection) { void $(a)() { ... } $: } I haven't yet tried to use that functionality in the stdlib meta-code, but doing so would be an obvious next step. * Fixup: change some $P to $p The capitalization on some of the GLSL intrinsic mappings got messed up during a find-and-replace operation when removing the double `$` that used to be required to escape things.	08 March 2018, 20:35:00 UTC
ed718ba	Yong He	06 March 2018, 21:12:37 UTC	Add a case to `TryUnifyVals` to cover `SubtypeWitness` vals (#435)	06 March 2018, 21:12:37 UTC
1fef9b4	Tim Foley	03 March 2018, 15:16:08 UTC	IR: next phase of "everything is an instruction" (#433) The main practical change here is that things that used to be `IRValue`s, like literals, are now being expressed as instructions in the global scope. In order to validate that things are actually being handled correctly, this change introduces an explicit "validation" pass that can be run on the IR to check for different invariants (although it doesn't check many of the important ones right now). I've left the validation pass turned off by default, but with a command-line flag to enable it. We may want to make it be on by default in debug builds, just to keep us honest. The main invariant for the moment is that when on IR instruction is used as an operand to another, it had better come from the same IR module. Some of the existing passes were violating this rule, in particular when it came to cloning of witness tables related to global generic parameter substitution. Those features can in theory be handled better now by allowing `specialize` instructions at other scopes, but I didn't want to over-complicate this change, so I make just enough fixes to ensure that these steps always clone witness tables they get from the "symbols" on an IR specialization context. In order for this to work when recursively specializing, I had to ensure that the logic for generic specialization had a notion of a "parent" specialization context that it would fall back to to perform cloning when necessary. This change keeps the logic that was caching and re-using the instructions for literal values within a module, but adds some logic that isn't really being tested right now for picking the right parent instruction to insert a constant instruction into. This logic doesn't trigger right now because all of the cases we are using it on have zero operands (and so they always get "hoisted" to the global scope), but eventually for things like types we want to be able to support instructions with operands (e.g., `vector<float, 4>`) and handle the case where some of those operands come from different scopes (e.g., when nested inside a generic). The final change here is mostly cosmetic: the `IRBuilder` is now more abstract about where insertion occurs: it tracks a single `IRParentInst` to insert into, and then an optional `IRInst` to insert before. In the common case, that parent is an `IRBlock`, but it could conceivably also be the global scope, or a witness table, etc. Use sites where we used to change those fields directly now use distinct methods `setInsertInto(parent)` and `setInsertBefore(inst)` which capture the two cases we care about. Accessors are also defined to extract the current block (if the current parent is a block), and the current "function" (global value with code, if the current parent is a global value with code, or a block inside one). With this work in place, it should be possible for a follow-on change to start putting `specialize` instructions at the global scope and thus clean up some of the on-the-fly specialization work. This work should also help with some of the requirements around a distinct IR-level type system and more explicit generics.	03 March 2018, 15:16:08 UTC
41dc26b	Tim Foley	01 March 2018, 21:11:24 UTC	IR: "everything is an instruction" (#432) * IR: "everything is an instruction" This change tries to streamline the representation of the IR in the following ways: * Every IR value is an instruction (there is no `IRValue` type any more) * All IR values that can contain other values share a single base (`IRParentInstruction`) * Dynamic casts to specific IR instruction types can be accomplished with a new `as<Type>(inst)` operation, that uses the IR opcode to implement casts. The biggest change in terms of number of lines is getting rid of `IRValue`. The diff here could probably be smaller if I'd just done `typedef IRInst IRValue;`. Along the way I also renamed the `getArg`/`getArgs`/`getArgCount` combination over to `getOperand`/`getOperands`/`getOperandCount` to avoid being confusing when we have something like a `call` instruction where the "arguments" of the call don't line up with the operands of the instruction. I also tried to clean up the representation of lists of child instructions to try to make it easier to iterate over them with C++ range-based `for` loops. Developers still need to be careful about mutating the contents of a block while iterating over it in this fashion (e.g. if you remove the "current" element, the iteration will end prematurely). Probably the thorniest change here is that parameters are now just represented as the first N instructions in a block, which means: * We need to perform a linear search to find the end of the parameter list. This is probably not often a problem, because usually you would be iterating over the parameters anyway, and that will be linear in the number of parameters. * Algorithms that iterate over a block either need to ignore parameters, treat parameters just like other instructions, or somehow cleave the list into the range of parameters, and the range of "ordinary" instructions (which involves the same linear search above). * When inserting into a block, we need to be careful not to insert instructions at invalid locations (e.g., insert a temporary before the parameters, or insert a parameter in the middle of the code). I can't pretend that I've handled the details of that here. (This is no different than having to make the same adjustments for phi nodes in a typical SSA representation) * One possible future-proof approach is to implement a pass that sorts the instructions in a block so that parameters always come first. That would let us implement passes without caring about this detail, and then clean up right before any pass that cares about the relative order of parameters and other instructions. The current change is missing any work to make literals and other instructions that used to be `IRValue`s properly nest inside of their parent module. Right now these instructions are just left unparented, and may actually end up being shared between distinct modules. Fixing that will need a follow-up change. The biggest challenge there is that it introduces instructions at the global scope that aren't `IRGlobalValue`s. This change doesn't try to take advantage of any of the new flexibility (e.g., by nesting `specialize` instructions inside of witness tables). The goal is to do exactly what we were doing before, just with a different representation. * Warning fix	01 March 2018, 21:11:24 UTC
372900b	Tim Foley	27 February 2018, 00:28:24 UTC	Add GLSL translations for bit manipulation intrinsics (#430) The following translations are added: * HLSL `countbits` becomes GLSL `bitCount` * HLSL `firstbitlow` becomes GLSL `findLSB` * HLSL `firstbithight` becomes GLSL `findMSB` * HLSL `rerverseBits` becomes GLSL `bitfieldReverse` There are currently no HLSL equivalents for the bitfield insert/extract operations in GLSL. In the future we could expose those as intrinsics under their GLSL names, with HLSL translations, if desired.	27 February 2018, 00:28:24 UTC
c3bc496	Tim Foley	26 February 2018, 22:02:29 UTC	Merge from 0.9.x (#429) * Fix bug when subscripting a type that must be split (#396) The logic was creating a `PairPseudoExpr` as part of a subscript (`operator[]`) operation, but neglecting to fill in its `pairInfo` field, which led to a null-pointer crash further along. * Allow writes to UAV textures (#416) Work on #415 This issue is already fixed in the `v0.10.` line, but I'm back-porting the fix to `v0.9.`. The issue here was that the stdlib declarations for texture types were only including the `get` accessor for subscript operations, even if the texture was write-able. I've also included the fixes for other subscript accessors in the stdlib (notably that `OutputPatch<T>` is readable, but not writable, despite what the name seems to imply). * Fix infinite loop in semantic parsing (#424) The code for parsing semantics was looking for a fixed set of tokens to terminate a semantic list, rather than assuming that whenever you don't see a `:` ahead, you probably are done with semantics. This meant that you could get into an infinite loop just with simple mistakes like leaving out a `;`. This change fixes the parser to note infinite loop in this case, and adds a test case to verify the fix.	26 February 2018, 22:02:29 UTC
28887ae	Tim Foley	26 February 2018, 20:14:24 UTC	Fix GS stream types (#428) The code in `DeclRefType::Create` was treating all of the point/line/triangle output stream types as `HLSLPointStreamType`, which meant we always output GLSL geometry shaders with `layout(points) out;`.	26 February 2018, 20:14:24 UTC
b942849	Yong He	23 February 2018, 23:32:43 UTC	Merge pull request #426 from csyonghe/irtypesstep0 Refactor IR type system, step 0	23 February 2018, 23:32:43 UTC
5ab20eb	Yong He	23 February 2018, 22:22:36 UTC	Refactor IR type system, step 0 Pull BaseType, TextureFlavor and SamplerStateFlavor enums and helper functions into a shared file "type-system-shared.h".	23 February 2018, 22:41:46 UTC
7066759	Tim Foley	23 February 2018, 18:39:23 UTC	Initial support for cross-compilation of geometry shaders to GLSL (#423) These changes are related to getting a first Slang geometry shader to translate to GLSL. There are some unrelated cross-compilation fixes in here as well. * Add direct support to shader parameter layout for GS output streams, so that they are reflected as a container type * Fix the declarations of the `SampleCmp` methods; they should always return `float`, independent of the nominal element type of the texture. * Fix up our handling of `__target_intrinsic` modifiers, so that we are a little bit more careful in how we detect something as being just a simple name replacement (e.g., `__target_intrinsic(glsl, "foo")` should make us output `foo(original, args, here)`) vs. a custom expression (e.g., `__target_intrinsic(glsl, "bar+1")` should output `bar+1` and not use any arguments, even without any `$` substitutions). * Don't emit the `[unroll]` modifier when outputting GLSL. Eventually we need to fully unroll loops for GLSL output anyway. * Inspect th entry point parameter list (from the layout information) when emitting a GS, so that we can write out the correct `layout` modifiers for input primitive type and output primitive topology. * Add a new case to `ScalarizedVal` to handle cases where an HLSL system value needs to map to a GLSL built-in variable with a slightly different type (e.g., `SV_RenderTargetArrayIndex` is a `uint` while `gl_Layer` is an `int`). For now this is only hanlding trivial cases (where a direct cast can achieve the result we want), but eventually it might need to handle things like conversion between arrays and vectors. * This is mostly just the infrastructure for the feature, and the actual enumeration of the correct types for all the system values is still to be done. * Handle a few more cases in assignment between `ScalarizedVal`. In particular, deal with cases where `materializeValue` is called on a tuple that has an array type, so that we need to construct the individual array elements. * Add translation for GS output stream `Append()` and `RestartStrip()` * Note that the translation of `Append()` seems to ignore its argument; this is because we desugar the operation during legalization for GLSL (see next item) * When legalizing for GLSL, detect an entry point parameter that is a GS stream, and translate it into `out` variables for its element type, and then rewrite any calls to `Append()` in the body of the entry point to be preceded by assignment to those variables. This works in tandem with the above translation of HLSL `Append()` calls into GLSL `EmitVertex()` calls. * We are detecting calls to `Append()` in a slightly hacky way, by looking at decorations on the callee to make sure that it is a function that is determined to translate to `EmitVertex()`. * Right now we aren't handling calls to `Append()` in other functions. It wouldn't be hard in principle to walk all the functions in the module and apply the translation (assuming we don't want to start supporting multiple output streams), but this wouldn't handle the passing of the GS output stream between functions. (This points out that there is a need for an additional type legalization pass that desugars away parameters of types that aren't actually meaningful on the target).	23 February 2018, 18:39:23 UTC
42c3a7c	Yong He	22 February 2018, 20:35:11 UTC	Merge pull request #422 from csyonghe/mangledName Make `IRGlobalValue::mangledName` a `Name*`	22 February 2018, 20:35:11 UTC
ac1dfba	Yong He	22 February 2018, 20:08:47 UTC	Make `IRGlobalValue::mangledName` a `Name*` This allows us to get rid of `IRGlobalValue::dispose()`.	22 February 2018, 20:09:50 UTC
7eaaf1a	Tim Foley	22 February 2018, 19:44:57 UTC	Initial work on validating "constexpr"-ness in IR (#420) * Initial work on validating "constexpr"-ness in IR The underlying issue here is that certain operations in the target shading languages constrain their operands to be compile-time constants. A notable example is the optional texel offset parameter to the `Texture2D.Sample` operation. When calling these operations in GLSL, the user is required to pass a "constant expression," and any variables in that expression must therefore be marked with the `const` qualifier (and themselves be initialized with constant expressions). Any GLSL output we generate must of course respect these rules. When calling these operations in HLSL, the user is not so constrained. Instead, they can pass an arbitrary expression, which may involve ordinary variables with no particular markup, and then the compiler is responsible for determining if the actual value after simplification works out to be a constant. In some cases, the requirement that a value be constant might actually trigger things like loop unrolling. Also, it is okay to use a function parameter to determine such a constant expression, as long as the argument turns out to be a constant at all call sites. The way we have decided to tackle these challenges in Slang is that we we propagate a notion of `constexpr`-ness through the IR. This is currently being tackled in `ir-constexpr.cpp` with a combination of forward and backward iterative dataflow: * When the operands to an instruction are all `constexpr`, and the opcode is one we believe can be constant-folded, then we infer that the instruction can be evaluated as `constexpr` * When instruction is required to be `constexpr`, then we infer that all of its operands are also required to be `constexpr`. If this process ever infers that a function parameter is required to be `constexpr`, then we might have to continue propagation at all the call sites to that function. If after all the propagation is done, there are any cases where an instruction is required to be `constexpr`, but it can't be `constexpr` (we weren't able to infer `constexpr`-ness for its operands), then we issue an error. This implementation encodes the idea of `constexpr`-ness in the IR as part of the type system, using a simplified notion of rates. This change adds a `RateQualifiedType` that can represent `@R T`, and then introduces a `ConstExprRate` that can be used for `R`. Many accessors for the type information on IR nodes were updated to distinguish when one wants the "full" type of an IR value (which might include rate information) vs. just the "data" type. A `constexpr` qualifier was added in the front-end, and is being used to decorate the texel offset parameter for `Texture2D.Sample`. Lowering from AST to IR looks for this qalifier and infers when a function parameter must be typed as `@ConstExpr T` instead of just `T`. There are lots of limitations and gotchas in the implementation so far: * The `@ConstExpr` rate is the only one added in this change, but it seems clear that the conceptual `ThreadGroup` rate that was added to represent `groupshared` should probably get folded into the representation. * I'm not 100% pleased with how many places in the IR I have to special-case for rate-qualified types. At the same type, pulling out rate as a distinct field on `IRValue` would probably require that we pay attention to rate everywhere. * I've added a test case to show that we can issue errors when users fail to provide a constant expression for the texel offset, but the actual error message isn't great because it doesn't indicate why a constant expression was required. Realistically the "initial IR" should contain a few more decorations we can use to relate error conditions back to the original code (even if this is in a side-band structure). * I've added a test case that is supposed to show that we can back-propagate `constexpr`-ness to local variables, and I've manually confirmed that it works for Vulkan/SPIR-V output, but the level of Vulkan support in `render_test` today means I can't enable the test for check-in. * While I'm attempting to propagate `@ConstExpr` information from callees to callers, I haven't implemented any logic to specialize callee functions based on values at call sites. * In a similar vein, there is no handling of control-flow dependence in the current code. If we infer that a phi (block parameter) needs to be `@ConstExpr`, then it isn't actually enough to require that the inputs to the phi (arguments from predecessor blocks) are all `@ConstExpr` because we also need any control-flow decisions that pick which incoming edge we take to be `@ConstExpr` as well. * As a practical matter, implicit propagation of `@ConstExpr` from a function body to a function parameter should only be allowed for functions that are "local" to a module. Any function that might be accessed from outside of a module should really have had its `@ConstExpr` parameter marked manually, and our pass should validate that they follow their own rules. Right now we have no kind of visibility (`public` vs `private`) system, so I'm kind of ignoring this issue. While that is a lot of gaps, this is also just enough code to get the Falcor MultiPassPostProcess example working, so I'm inclined to get it checked in. * Fixup: missing expected output for test * Fixup: disable test that relies on [unroll] for now	22 February 2018, 19:44:57 UTC
01c4134	Yong He	21 February 2018, 04:55:51 UTC	Merge pull request #417 from csyonghe/leakfix Fix IR memory leaks.	21 February 2018, 04:55:51 UTC
4cf46e5	Yong He	20 February 2018, 23:50:41 UTC	bug fix: witnessTable's subTypeDeclRef should have default substitution for getSpecailizedMangledName to work properly.	20 February 2018, 23:50:41 UTC
61a6d18	Yong He	20 February 2018, 21:34:59 UTC	make CompileRequest retain specailized IR module. This is to workaround with the issue that the Types returned in ProgramLayout may reference to IRWitnessTables via GlobalGenericParamSubstitution.	20 February 2018, 21:34:59 UTC
5de62bb	Yong He	20 February 2018, 00:53:45 UTC	more to fixing memory leaks 1. reorder destruction order of several key classes to avoid using deleted IR objects when destroying Types 2. remove Session::canonicalTypes and make each Type own a RefPtr to the canonicalType, to allow types to be destroyed along with each IRModule it belongs to.	20 February 2018, 00:53:45 UTC
ff8adf7	Yong He	19 February 2018, 18:00:51 UTC	Fix IR memory leaks. 1, make IRModule class own a memory pool for all IR object allocations 2. For now, we allow IR objects to own other (externally) heap allocated objects, such as String, List and RefPtrs by tracking all IR objects that has been allocated for the IRModule in a list named `IRModule::irObjectsToFree`. and call destructor for all these objects upon the destruction of the IRModule. In the long term, we should eliminate the use of all these externally allocated types in IR system and get rid of this tracking and explicit destructor calls. 3. remove non-generic `createValueImpl` functions and retain only generic versions in IRBulider so we can properly call the constructor of the IR types to set up virtual tables correctly for destructor dispatching. 4. add `MemoryPool` class for allocation of the IR objects. 5. Make sure we are disposing IRSpecContexts when we are done with the specialized IR module. 6. Add `_CrtDumpMemoryLeaks()` calls to check memory leaks upon destruction of a Slang session. If we are to support multiple sessions at a time, this call should probably be replaced with the more advanced MemoryState versions of the memory leak checker.	19 February 2018, 20:26:26 UTC
51cdcad	Tim Foley	18 February 2018, 16:08:46 UTC	stdlib fixes for Vulkan (#414) * stdlib fixes for Vulkan - Make sure to emit `image` instead of `texture` for `RWTexture` types - Change `GetDimensions` to call `imageSize` instead of `textureSize` when we use images - Always output a `layout(rgba32f)` for variables that translate to `image` types - TODO: we should emit an appropriate format based on the type, or let the user specify one - Fix GLSL translation for `any()` function (required boolean inputs) - Add GLSL translation for `GroupMemoryBarrierWithGroupSync()` - Map HLSL `groupshared` to GLSL `shared` These together are enough to get the Falor `ComputeShader` example to work. fixup for warning	18 February 2018, 16:08:46 UTC
8c87259	Tim Foley	16 February 2018, 20:00:20 UTC	Implement IR-level translation of system values for GLSL (#413) The approach here isn't ideal. We already have a pass that transforms HLSL varying input/output types into GLSL global variables. This change makes it so that when those inputs/outputs have system-value semantics, we generate a global variable declaration with the appropriate `gl_` name (leaving the type the same for now). Later, when emitting code, we just skip emitting declarations for declarations with mangled names that start with `gl_`. A more complete implementation will be needed later on, which handles cases where the translation requires types to be changed (so that conversion code needs to be inserted).	16 February 2018, 20:00:20 UTC
1b93da0	Tim Foley	16 February 2018, 17:04:44 UTC	IR/Vulkan fixes (#412) * Fix bugs around IR legalization of GLSL input/output - Add case to handle assignment of one `ScalarizedVal::Flavor::address` to another (still need to make sure we are handling all the possible cases there) - Revamp logic for creating global variable declarations for varying inputs/outputs. - Actually handle creating array declarations (not sure if binding locations will be correct) - Properly deal with offsetting of locations for nested fields - Only create varying input/output layout information as needed for the separate `in` and `out` variables we create to represent a single HLSL `inout` varying * During SSA generation, recursively remove trivial phis This is actually written up in the original paper I used as a reference, but I hadn't implemented the case yet. When you eliminate one phi as trivial (because its only operands were itself and at most one other value), you might find that another phi becomes trivial (because it had this phi as an operand, but now it will have the other value...). The one thing that made any of this tricky is that our "phi" nodes are really block parameters, and thus they don't technically have operands (`IRUse`s). The `IRUse`s for each phi were being tracked in a separate array, and had their `user` field set to null. With this change, I set their `user` to be the corresponding `IRParam` for the phi (and that means I changed `IRParam` to inherit from `IRUser` even though it shouldn't really be required). * Re-build SSA form after specialization/legalization The main reason to do this is that legalization might scalarize types, and thus might allow us to clean up resource-type local variables that we were not able to clean up when they were part of an aggregate. Note: we shouldn't really need to do this, because the front-end should actually be guaranteeing that types that include resources are used in "safe" ways, but we currently don't have the analyses required to support that. * Give an error message if we get GLSL input The API and command-line interface still recognize and nominally support GLSL input files, because they need to be supported in the "pass-through" mode. This change just adds an error message if we encounter a GLSL input file in anything other than "pass-through" mode.	16 February 2018, 17:04:44 UTC
3254970	Tim Foley	13 February 2018, 18:22:54 UTC	Fix a bug in IR use-def information (#406) The basic problem here is that when unlinking an `IRUse` from the linked list of uses, there were several cases where I was failing to set the `prevLink` field of the next node to match the `prevLink` field of the node being removed. That doesn't show up when walking the linked list of uses forward, but it breaks it whenever you have subsequent unlinking operations. This change fixes the bugs of that kind I could find, and also adds a debug validation method to try to avoid breaking it again. I also made more access to `IRUse` go through accessor methods rather than using fields directly, to try to avoid this kind of error. I stopped short of making anything `private`, because I tend to find that it creates more hassles than it avoids. A few other fixes along the way: - Made the `List<T>` type default-initialize elements when you resize it. I hadn't realized we weren't doing that. - Add a standalone `dumpIR(IRGlobalValue*)` so help when debugging issues.	13 February 2018, 18:22:54 UTC
214a1fc	Tim Foley	10 February 2018, 05:27:26 UTC	Handling of duplicate global shader parameter declarations (#405) * Issue error when shader parameter type doesn't match between translation units We currently allow users to compile different translation units at the same time for the vertex and fragment shader, and those different translation units might contain distinct declarations for the "same" parameter. Furthermore, those declarations might use distinct declaratins of the "same" type. We currently bind those as if they were one parameter, which means we assume they have types that are actually equivalent. Unfortunately, things break when that isn't the case. This change adds error messages when parameter declarations that are determined to be the "same" don't have types that are either equiavlent or match "structurally" (same names, same fields, and field types match recursively). Ideally most users won't see these errors, because they will only submit single files to the compiler. Eventually we may actually decide to require that case and simplify our logic. * Support types with layout coming from a "matching" type Because of how the front-end performs parameter binding, we can end up in cases where a variable is given a `TypeLayout` based on a "matching" type from another translation unit (because the "same" shader parmaeter got declared in multiple TUs). This change tries to support that use case by avoiding absolute field decl-refs in the representation used for type legalization, so that we don't end up in a situation where we look up field layout based on information that doesn't match.	10 February 2018, 05:27:26 UTC
c7c97ad	Tim Foley	08 February 2018, 22:46:12 UTC	Basic IR support for `static const` globals (#404) * Basic IR support for `static const` globals Our strategy for lowering global variables can fall back to putting their initialization into a function, but that isn't really appropriate for global constants (it also isn't appropriate for arrays, but we'll need to deal with that seaprately). This change adds a distinct case for global constants (rather than treating them as variables), and forces the emission logic to always emit them as a single expression. Doing this makes assumptions about how the IR for these constants gets emitted (and what optimziations might do to it). In order to make things work, I had to switch the handling of initializer-list expressions to not be lowered via temporaries and mutation (since that isn't a good fit for reverting to a single expression). I've added a single test case to ensure that this works in the simplest scenario. My next priority will be to see if this unblocks my work in Falcor. * Fixup: bug fixes	08 February 2018, 22:46:12 UTC
112caca	Tim Foley	08 February 2018, 15:54:04 UTC	Falcor fixes (#402) * Re-define deprecated compile flags By including these flags in the header file, with a value of zero, we can allow some existing code to compile even after the major changes to the implementation. * The `SLANG_COMPILE_FLAG_NO_CHECKING` option will effectively be ignored, since checking is always enabled. * The `SLANG_COMPILE_FLAG_SPLIT_MIXED_TYPES` option will now act as if it is always enabled (and indeed some of the code has been relying on this flag being set always). * Make subscript operators writable for writable textures This even had a `TODO` comment saying that we needed to fix it, and now I'm seeing semantic checking failures because we didn't define these and so we find assignment to non l-values. * Fix definitions of any() and all() intrinsics These should always return a scalar `bool` value, but they were being defined wrong in two ways: 1. They were using their generic type parameter `T` in the return type 2. They were returning a vector in the vector case, and a matrix in the matrix case. This change just alters the return type to be `bool` in all cases. * Fix bug in SSA construction When eliminating a trivial phi node, it is possible that the phi is still recorded as the "latest" value for a local variable in its block. When later code queries that value from the block (which can happen whenever another block looks up a variable in its predecessors), it would get the old phi and not the replacement value. I simply added a loop that checks if the value we look up is a phi that got replaced, and then continues with the replacement value (which might itself be a phi...). A more advanced solution might try to get clever and have the map itself hold `IRUse` values so that we can replace them seamlessly. * Simplify IR control flow representation This change gets rid of various special-case operations for conditional and unconditional branches, and instead requires emit logic to recognize when a direct branch is targetting a `break` or `continue` label. The new approach here isn't perfect, but it seems beter than what we had before, because it can actually work in the presence of control-flow optimizations (including our current critical-edge-splitting step). * Load from groupshared isn't groupshared When loading from a `groupshared` variable, the resulting temporary shouldn't have the `groupshared` qualifier on it. This might eventually need to generalize to a better understanding of storage modifiers in the IR, but I don't really want to deal with that right now. * Don't emit references to typedefs in output code Now that we are using the IR for all codegen, we shouldn't be dealing with surface-level things like `typedef` declarations in the output code; just use the type that was being referred to in the first place. * Fix floating-point literal printing for IR The IR was calling `emit()` instead of `Emit()` (we really need to normalize our convention here), and was implicitly invoking a default constructor on `String` that takes a `double` (that constructor should really be marked `explicit`), and which doesn't meet our requirements for printing floating-point values. * Fix error when importing module that doesn't parse We already added a case to bail out if semantic checking fails, but neglected to add a case if there is an error during parsing of a module to be imported. Note: this logic doesn't correctly register the module as being loaded (but still in error), so users could see multiple error messages if there are multiple `import`s for the same module. * Improve error message for overload resolution failure - Drop debugging info from the candidate printing - Add cases to print `double` and `half` types properly * Fixup: switch loopTest to ifElse in expected IR output	08 February 2018, 15:54:04 UTC
be8b891	Tim Foley	07 February 2018, 22:37:37 UTC	Generate SSA form for IR functions (#400) * Generate SSA form for IR functions The basic idea here is simple: in the front-end after we have lowered the AST to initial IR we will apply a set of "mandatory" optimization passes. The first of these is to attempt to translate the all functions into SSA form so that they are amenable to subsequent dataflow optimizations. Eventually, the mandatory optimization passes would include diagnostic passes that make sure variables aren't used when undefined, etc. Just doing basic SSA generation already cleans up a lot of the messiness in our IR today, because constructs that used to involve many local variables can now be handled via SSA temporaries. The implementation of SSA generation is in `ir-ssa.cpp`, and it follows the approach of Braun et al.'s "Simple and Efficient Construction of Static Single Assignment Form." I used this instead of the more well-known Cytron et al. algorithm because Braun's algorith mis very simple to code, and does not require auxiliary analyses to generate the dominance frontier. The main wrinkle in our SSA representation right now is that instead of using ordinary phi nodes, we instead allow basic blocks to have parameters, where predecessor blocks pass in different parameter values. This encodes information equivalent to traditional phi nodes, but has two (small) benefits: 1. There is no fixed relationship between the order of phi operands and predecessor blocks, so we don't have to worry about breaking the phis when we alter the order in which predecessors are stored. This is important for us because predecessors are being stored implicitly. 2. It is easy to operationalize a "branch with arguments" either when lowering to other languages, or when interpreting the IR. A branch with arguments is implemented as a sequence of stores from the arguments to the parameters of the target block (very similar to a call), followed by a jump to the block. Relevant to the above, this change also adds an interface for enumerating the predecessors or successors of a block in our CFG. Rather than use an auxliary structure, we directly use the information already encoded in the IR: * The sucessors of a block are the target label operands of its terminator instruction. In our IR this is a contiguous range of `IRUse`s, possible with a stride (to account for the way `switch` interleaves values and blocks). * The predecessors of a block are a subset of the uses of the block's value. Specifically, they are any uses that are on a terminator instruction, and within the range of values that represent the successor list of that instruction. One important limitation of the "blocks with arguments" model for handling phis is that it is really only convenient to stash extra arguments on an unconditional terminator instruction. This change works around this prob lem by breaking any "critical edges" - edges between a block with multiple successors and one with multiple predecessors. We assume that "phi" nodes will only ever be needed on a block with multiple predecessors, and because critical edges are broken, each of these predecessors will then have only a single successor, so its branch instruction can handle the extra arguments. This change introduces a notion of an "undefined" instruction in the IR. This is handled as an instruction rather than a value because I anticipate that we will want to distinguish different undefined values when it comes time to start issuing error messages (those messages will need to point to the variable that was used when undefined). * Fix expected test output. Another change was merged that enabled the `glsl-parameter-blocks` test, and its output is affected by our IR optimization work.	07 February 2018, 22:37:37 UTC
1fbc73d	Tim Foley	07 February 2018, 21:41:43 UTC	Support __target_intrinsic modifiers in IR codegen (#401) The standard library already has a bunch of these decorations, since they were added to support Slang->Vulkan codegen on the AST-to-AST path. This change makes the IR code generator able to exploit the modifiers so that we pick up a bunch of Vulkan support "for free" in the short term. The basic change is in `lower-to-ir.cpp` where we copy over any `TargetIntrinsicModifier`s to become `IRTargetIntrinsicDecoration`s with the same information. We then need a bit of logic in `ir.cpp` to make sure we clone them as needed. The core work of using the modifiers is in `emit.cpp`, where I basically just copy-pasted the existing logic that applied in the AST path (all the AST-related code there is dead, and we should clean it up soon). The big change that comes with this logic is that when dealing with a member function, the numbering of the argument used in the intrinsic definition string changes, so that `$0` refers to the base object (whereas before the base object was looked up via the base expression of a `MemberExpr` used for the function). This requires a bunch of the definitions in the library to be updated; hopefully I caught them all. For kicks, I've re-enabled a cross-compilation test just to confirm that we are generating valid SPIR-V for code that performs texture-fetch operations. I don't expect us to keep that test enabled as-is in the long term, though, because it would be much better to instead use render-test to do the same thing. Alas, beefing up the Vulkan support in render-test is an outstanding work item, and I didn't want to pollute this change with more work along those lines.	07 February 2018, 21:41:43 UTC
662f43f	Tim Foley	03 February 2018, 15:30:54 UTC	Remove non-IR codegen paths (#398) The basic change is simple: remove support for all code generation paths other than the IR. There is a lot of vestigial code left, but the main logic in `ast-legalize.` is gone. Doing this breaks a lot* of tests, for various reasons: - We can no longer guarantee exactly matching DXBC or SPIR-V output after things pass through out IR - Many builtins don't have matching versions defined for GLSL output via IR (even when they had versions defined via the earlier approach that worked with the AST) - A lot of code creates intermediate values of opaque types in the IR, which turn into opaque-type temporaries that aren't allowed (this breaks many GLSL tests, but also some HLSL) I implemented some small fixes for issues that I could get working in the time I had, but most of the above are larger than made sense to fix in this commit. For now I'm disabling the tests that cause problems, but we will need to make a concerted effort to get things working on this new substrate if we are going to make good on our goals.	03 February 2018, 15:30:54 UTC
58475a8	Tim Foley	02 February 2018, 18:38:22 UTC	Revamp documentation (#395) - Remove references to building by embedding source (not recommended at this point) - Push users more toward binary builds rather than building from source (but include a document that talks about how to build) - Remove most (all?) references to supporting GLSL input - Expand the language guide to talk about the new features - Add a document that talks about the parameter layout algorithm	02 February 2018, 18:38:22 UTC
0360f81	Tim Foley	02 February 2018, 16:49:04 UTC	Remove support for the -no-checking flag (#392) * Remove support for the -no-checking flag Fixes #381 Fixes #383 Work on #382 - No longer expose flag through API (`SLANG_COMPILE_FLAG_NO_CHECKING`) and command-line (`-no-checking`) options - Remove all logic in `check.cpp` that was withholding diagnostics (including errors) when the no-checking mode was enabled - Remove `HiddenImplicitCastExpr`, which was only created to support no-checking mode (it represented an implicit cast that our checking through was needed, but couldn't emit because it might be wrong) - Remove logic for storing function bodies as raw token lists when checking is turned off. I'm leaving in the `UnparsedStmt` AST node in case we ever need/want to lazily parse and check function bodies down the line. - Remove a few of the code-generation paths we had to contend with, but keep the comment about them in place. - Remove GLSL-based tests that can't meaningfully work with the new approach. - Fix other tests that used a GLSL baseline so that their GLSL compiles with `-pass-through glslang` instead of invoking `slang` with the `-no-checking` flag. - Remove tests that were explicitly added to test the "rewriter + IR" path, since that is no longer supported. There is more cleanup that can be done here, now that we know that AST-based rewrite and IR will never co-exist, but it is probably easier to deal with that as part of removing the AST-based rewrite path. We've lost some test coverage here, but actually not too much if we consider that we are dropping GLSL input anyway. * Fixup: test runner was mis-counting ignored tests * Fixup: turn on dumping on test failure under Travis * Fixup: enable extensions in Linux build of glslang	02 February 2018, 16:49:04 UTC
b034398	Tim Foley	02 February 2018, 15:49:32 UTC	Initial work on getting render-test to support vulkan (#391) * Basic fixes to gets some Vulkan GLSL out of the IR path We haven't been paying much attention to the Vulkan output from the IR path, but that needs to change ASAP. This commit really just implements quick fixes, without concern for whether they are a good fit in the long term. - Add some more mappings from D3D `SV_` semantics to built-in GLSL variables, and stop redeclaring those built-in variables in our output GLSL. - Add custom output logic for HLSL `StructuredBuffer<T>` types, so that they emit as `buffer` declarations with an unsized array inside. This has some real limitations: - What if the user passes the type into a function? The parameter should be typed as an (unsized) array, and not a buffer. - What happens if we have an array of structured buffers? We need to declare an array of blocks (which GLSL allows), but this changes the GLSL we should emit when indexing. - Customize the way that we emit entry point attributes (e.g., `[numthread(...)]`) to also support outputting equivalent GLSL `layout` qualifiers. In many of these cases, a better fix might involve doing more of this work in the IR as part of legalization (e.g., we already have a pass that deals with varying input/output for GLSL, so that should probalby be responsible for swapping the `SV_` to `gl_`, especially in cases where the types don't match perfectly across langauges). * Start adding Vulkan support to render-test - Add both Vulkan and D3D12 as nominally supported back-ends - Add a git submodule to pull in the Vulkan SDK dependencies - I don't want our users to have to install it manually, since the SDK is huge - Checking in the binaries to our main repository seems like a bad idea, but my hope is that we can prune the bloat using a subodule with the `shallow` cloning option - Implement enough logic for the Vulkan back-end to get a single test passing on Vulkan * Fix warning * Fixup: disable new compute tests for Linux * Fixup: ignore Vulkan tests on AppVeyor * Dynamically load Vulkan implementation Rather than statically link to the Vulkan library, we will dynamically load all of the required functions. This removes the need to have the stub libs involved at all. * Remove vulkan submodule I had set up a `vulkan` submodule to pull in the headers and stub libs, but now that we are going to dynamically load all the symbols anyway, the stub lib binaries aren't needed and we can just commit the headers. * Add Vulkan headers to external/	02 February 2018, 15:49:32 UTC
652a3c9	Tim Foley	01 February 2018, 22:41:34 UTC	Fix a bug in import handling (#394) The recent change that removed `#import` accidentally introduced a regression that made any code that imports the same module in more than one place fail. I'm just fixing the bug for now to unblock users, but this should really get a regression test.	01 February 2018, 22:41:34 UTC
4583e39	Tim Foley	01 February 2018, 20:06:06 UTC	Implement type splitting for raw buffers (#393) * Fix render-test to handle raw buffers I don't know if this fix will work for UAVs that are neither structured nor raw, but it fixes the code that currently only really works if every UAV is structured (since it doesn't set a format). * Make type legalization consider raw buffer types The type layout logic was already handling these, but the type splitting logic in legalization was failing to split structure types that contain, e.g., `RWByteAddressBuffer`. A compute test case has been added to confirm the fix.	01 February 2018, 20:06:06 UTC
b6bc083	Tim Foley	30 January 2018, 00:32:52 UTC	Remove #import directive (#389) Fixes #380 The `#import` directive was a stopgap measure to allow a macro-heavy shader library to incrementally adopt `import`, but it has turned out to cause as many problems as it fixes (not least because users have never been able to form a good mental model around which kind of import to do when). This change yanks support for the feature.	30 January 2018, 00:32:52 UTC
06f0eff	Tim Foley	26 January 2018, 23:30:23 UTC	Fix handling of errors in imported modules (#387) * Fix handling of errors in imported modules - If a semantic error is detected in an imported module, then don't try to generate IR code for it - Also, if a module (transitively) imports itself, then report that as an error - The way I'm checking for this is a bit hacky (I'm adding the module to the map of loaded modules, but in an "unfinished" state, and then using that unfinished state to detect the import of a module already being imported). This isn't a 100% complete solution for any of the related problems, but it improves the user experience for the common case. * Remove #import test. The feature is slated to be removed, so it isn't worth fixing up this test case.	26 January 2018, 23:30:23 UTC
a050f4a	Tim Foley	26 January 2018, 21:23:16 UTC	Fix some crashing bugs around local variable declarations. (#385) The basic problem here arises when a local variable is used either before its own declaration: ```hlsl int a = b; ... int b = 0; ``` or when a local variable is used in its own decalration: ```hlsl int b = b; ``` In each case, Slang considers the scope of the `{}`-enclosed function body (or nested statement) as a whole, and so the lookup can "see" the declaration even if it is later in the same function. This behavior isn't really correct for HLSL semantics, so the right long-term fix is to change our scoping rules, but for now users really just want the compiler to not crash on code like this, and give an error message that points at the issue. This change makes both of the above examples print an error message saying that variable `b` was used before its declaration, which is accurate to the way that Slang is interpreting those code examples. This is currently treated as a fatal error, so that compilation aborts right away, to avoid all of the downstream crashes that these cases were causing.	26 January 2018, 21:23:16 UTC
4cd18ec	Yong He	23 January 2018, 21:04:45 UTC	Update README.md	23 January 2018, 21:04:45 UTC
94fd60e	Yong He	22 January 2018, 02:25:05 UTC	Merge pull request #379 from tfoleyNV/generic-extension-fixes Generic extension fixes	22 January 2018, 02:25:05 UTC
f114c37	Tim Foley	22 January 2018, 02:03:35 UTC	A hacky fix for specializing methods from extensions If we don't find the generic we expect in the first pass during IR specialization, then we check for the special case where we are trying to specialize something from a generic extension, using the type being extended. We assume that the generic parameter lists match (that part is the huge hack), and collect the arguments as if they were for the extension instead of the type. This will break when/if we ever have generic extensions with parameter lists that don't match the type being extended.	22 January 2018, 02:06:40 UTC
ef2f92f	Tim Foley	22 January 2018, 01:23:03 UTC	Trying to get generic extensions to work - Don't drop specializations on a method when adding it to requirement dictionary - Handle extension declarations under a generic when emitting to IR	22 January 2018, 02:06:40 UTC
4131120	Tim Foley	22 January 2018, 02:05:39 UTC	Fix legalization of generic types (#377) Previously, all legalizations of a generic type would use the name of the original decl for the "ordinary" part of things, and this would lead to collisions because the names didn't include the mangled generic arguments. This is now fixed by storing the mangled name of the original inside of `struct` declarations created for legalization, and using those names instead. Also adds support for `getElementPtr` instructions when doing IR type legalization. Also tries to make a `DeclRefType` convert to a string using the underlying `DeclRef`. This doesn't help because `DeclRef::toString` doesn't actually include generic arguments either.	22 January 2018, 02:05:39 UTC
8196dc4	Yong He	22 January 2018, 00:26:52 UTC	specialize witness tables when needed when specializing `lookup_witness_table` instruction. (#376)	22 January 2018, 00:26:52 UTC
4044a1d	Yong He	21 January 2018, 18:48:31 UTC	Merge pull request #372 from csyonghe/master Allow type expression as type argument, fix global param enum order	21 January 2018, 18:48:31 UTC
f681a15	Tim Foley	21 January 2018, 17:45:58 UTC	Add directive to ignore file for test runner	21 January 2018, 17:45:58 UTC
ce87911	Yong He	21 January 2018, 12:09:55 UTC	Improvements and bug fixes for global type parameters 1. allow spReflection_FindTypeByName to accept arbitrary type expression string 2. allow const int generic value to be used as expression value, and as array size 3. various bug fixes in witness table specialization / function cloning during specializeIRForEntryPoint to avoid creating duplicate global values, not copying the right definition of a function from the other module, not cloning witness tables that are required by specializeGenerics etc.	21 January 2018, 12:09:55 UTC
913f4d0	Yong He	20 January 2018, 07:48:12 UTC	bug fixes fixes #373 fixes bug that misses current translation unit's scope when resolving entry-point global type argument expression.	20 January 2018, 07:48:12 UTC
6dd64fd	Yong He	20 January 2018, 03:20:29 UTC	Make specialization presserve global parameter enumeration order in reflection data	20 January 2018, 03:21:29 UTC
9d515dd	Yong He	19 January 2018, 23:20:23 UTC	Allow arbitrary type string as type argument in spAddEntryPointEx.	20 January 2018, 03:21:29 UTC
2079b94	Yong He	18 January 2018, 20:49:59 UTC	Merge pull request #371 from csyonghe/master All compiler fixes to get ir branch work with falcor feature demo.	18 January 2018, 20:49:59 UTC

Newer
Older