Revision history - None - origin: https://github.com/shader-slang/slang

Revision	Author	Date	Message	Commit Date
348058f	Dietrich Geisler	27 July 2020, 16:14:17 UTC	Baseline Heterogeneous Example (#1460) * Baseline Heterogeneous Example This PR introduces a baseline heterogeneous example, including both a Slang file and an associated C++ helper file. This refactoring primarily moves the Slang file "into the driver's seat" while maintaining that the C++ side still does most of the actual work. * Fix to prelude path	27 July 2020, 16:14:17 UTC
87940a6	Tim Foley	25 July 2020, 01:12:41 UTC	Fix bugs related to mutating implementations of interface methods (#1461) There are two main bug fixes here: * We were failing to diagnose when code calls a `[mutating]` method on a value that doesn't support mutation (that is an r-value instead of an l-value). * We had a bug in the synthesis logic for interface requirements where we used the result type of the requirement in place of each of the parameter types. The second bug made synthesis often produce incorrect signatures with `void` parameters. The first bug meant that even though a `[mutating]` method should not be able to satisfy a non-`[mutating]` method (and we had code to enforce this for the "exact match" case), when we go on to try and synthesize a non-`[mutating]` method that satisfies the requirement by delegating to the user-written one, it would end up succeeding, because nothing was stopping a non-`[mutating]` method from calling a `[mutating]` one. In each case this code adds a fix and a test case to confirm it.	25 July 2020, 01:12:41 UTC
261fe75	Yong He	24 July 2020, 23:37:51 UTC	Ensure labels are dumped in `lower-to-ir` (#1459) * Ensure labels are dumped in `lower-to-ir`. There is a `dumpIR` function that accepts a label parameter already in slang-emit.cpp. This change moves it to slang-ir.cpp so it may be called from other files. * update expected test result Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	24 July 2020, 23:37:51 UTC
17d0da2	jsmall-nvidia	24 July 2020, 20:41:43 UTC	Enable CUDA for active-mask tests. (#1458) Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	24 July 2020, 20:41:43 UTC
eaf3f04	Tim Foley	24 July 2020, 19:07:39 UTC	Handle case of no global parameters for CPU/CUDA (#1457) The IR pass that introduces an explicit `KernelContext` for the CPU/CUDA back-ends was also responsible for adding an explicit parameter to the kernel entry point to receive the constant buffer (pointer) with all the global uniform parameters. However, if there were no global uniform parameters, this parameter wasn't getting introduced, which changed the signature/ABI of the generated entry point function. This change makes it so that the pass unconditionally adds a parameter. In the case where there are no global uniforms it just adds a `void*` parameter that never gets used. In order to avoid future regressions, this change also adds a test case to confirm that things work correctly when a kernel has only entry-point parameters and no global parameters.	24 July 2020, 19:07:39 UTC
ef9d76c	Yong He	24 July 2020, 17:18:22 UTC	`InterlockedAdd` CPU intrinsic implementation. (#1455) Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	24 July 2020, 17:18:22 UTC
cb0a08b	jsmall-nvidia	24 July 2020, 15:12:58 UTC	Test framework improvements (#1452) * Add -hide-ignored. Made the API filter, when enabled, filter out non-API tests. * Add ability to set categories at file level. Added wave, wave-mask and wave-active categories. * Added -api-only flag. * Don't synthesize tests from only CPU tests. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	24 July 2020, 15:12:58 UTC
7e952cd	Dietrich Geisler	24 July 2020, 05:50:53 UTC	CPU/GPU Compute Shader Example (#1451) * CPU/GPU Compute Shader Example This PR introduces an example to run a simple compute shader on the GPU in the heterogeneous-hello-world example. All loading code is currently run in C++, so the heterogeneity of this example is still a work in progress. This change updates exactly this example, and so should not cause issues elsewhere in the codebase. * Small fix * Added gfx to help the linker * Added back the struct * Updated premake to respect windows conditions * Completely removed het-example * Re-added example Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	24 July 2020, 05:50:53 UTC
61be38f	Yong He	23 July 2020, 22:33:04 UTC	Run array specialization in a separate pass. (#1449) * Run array specialization in a separate pass. * rename specializeFunctionCall->specializeFunctionCalls Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	23 July 2020, 22:33:04 UTC
fed4292	Yong He	23 July 2020, 20:47:12 UTC	Run SSA pass to clean up temporary variables during generics lowering. (#1447) * Run SSA pass to clean up generic temporary variables during lowering. * Fix `undefined` emitting logic. * revert dumpir control flag * Defer fold decision of `undefined` values after special case logic for GLSL and HLSL. * Update expected test result. * Manually update raygen.slang.glsl to minimize change. * fix formatting Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	23 July 2020, 20:47:12 UTC
e93d3a4	Tim Foley	23 July 2020, 16:20:28 UTC	Fix the way extension declarations are cached for lookup (#1450) During semantic checking, the compiler used to link together `ExtensionDecl`s into a singly-linked list dangling off of the `AggTypeDecl` that they applied to. This approach made lookup relatively easy, because given a `DeclRef` to an `AggTypeDecl` one could easily find and walk the list of candidate extensions. Unfortunately, the simple approach has two major strikes against it: * First, as we recently ran into, it creates a lifetime/ownership problem, in cases where the `ExtensionDecl` is outlived by the `AggTypeDecl` it applies to. This creates the one and only place in the compiler today where an "old" AST node might point to a "new" AST node, and it resulted in use-after-free problems in client code. * Second, the scoping of `extension`s ends up being completely wrong. All of the `extension` methods on a type end up being visible in all cases, instead of just in the context of modules where the `extension` itself is visible. The comparable feature in C# (static extension methods) is careful to not make scoping mistakes like this. The Swift language has loose scoping for `extension` more akin to what we have in Slang today, but the maintainers seem to consider it a misfeature. This change attempts to clean up both issues by changing the way that extension declarations are stored. There are two main pieces: 1. The primary "source of truth" for extension lookup has been moved to the `ModuleDecl`, where a module is responsible for storing a cache of the extensions declared within that module (keyed by the declaration of the type being extended). This cache is updated at the same point where the old code would mutate the AST node being depended on. 2. A secondary aggregated cache is added to the `SharedSemanticsContext` used during semantic checking. 
This cache includes entries from across multiple modules, and is intended to be invalidated and rebuilt on demand if new modules are added during checking. Access to the candidate extensions has now been put behind subroutines that require a semantics-checking context to be passed in (there was always one available in contexts that care about extensions). In addition, the operation for looking up members including those from extensions was refactored heavily to involve internal rather than external iteration and, more importantly, was changed so that it actually tests whether the `ExtensionDecl`s it loops over apply to the type in question, rather than blindly letting extensions members be looked up in ways that don't make sense. There are three test cases added here to confirm aspects of the fix: * First, I added a test that reproduces the crash that was being seen, so that we have a regression test for the fix. * Second, I added a basic semantic-checking test to confirm that an `extension` from an `import`ed module is still visible/usable, to confirm that I didn't break existing valid uses of extensions. * Third, I added a diagnostic test that ensures that we correctly ignore extensions that should not be visible in a given context as a result of `import` declarations. Co-authored-by: jsmall-nvidia <jsmall@nvidia.com>	23 July 2020, 16:20:28 UTC
cf50355	jsmall-nvidia	23 July 2020, 13:37:58 UTC	Fix for vulkan tests failing (#1456) * Clean up device when VKRenderer dtor is run. Added destroy methods to VulkanSwapChain & VulkanDeviceQueue * Small fixes around testing if DeviceQueue is valid. * Disable active-mask tests. Different drivers appear to change the results.	23 July 2020, 13:37:58 UTC
1159204	Dietrich Geisler	20 July 2020, 18:53:23 UTC	Multiple Entry Point Backend (#1437) * Multiple Entry Point Backend This PR introduces changes to the IR linking, emitting, and options for multiple entry points. Specifically, this PR updates several locations to support a (potentially empty) list of entry points, adding list infrastructure and looping over entry points as appropriate. * Formatting change * Updated unknown target case to not require an entry point * Formatting and list consts updates Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	20 July 2020, 18:53:23 UTC
975c5db	jsmall-nvidia	17 July 2020, 20:38:18 UTC	Disable specializing function calls if they have a struct param, that contains an array (#1448) * This code is disabled, it was part of the optimization `Specialize function calls involving array arguments. (#1389)` on github. It is disabled here because it causes a problem when a struct is passed to a function that contains a structured buffer and an array. It is specialized on the struct type, and so those types become parameters to the function. If the struct contains a structured buffer this is a problem on GLSL/VK based targets because currently structured buffers cannot be function parameters. The fix for now is to just disable this optimization. * Fix typo in name of test expected values.	17 July 2020, 20:38:18 UTC
ee75558	jsmall-nvidia	16 July 2020, 20:55:31 UTC	Running generators as separate premake project step (#1441) * Put the running of generators into a separate project, to try to ensure the generated products are available for other dependencies when compiling with multiple threads on linux. * Made paths Strings in slang-generate. Made paths use / for path separators (rather than \ on windows which causes some problems with #line). * Make the run-generators proj a utility step. * Made run-generators a StaticLib. * Fix problem with generating when not necessary. * Trying to get abspath to work on linux. * Add run-generator-main.cpp dummy file. * Add comment about the issues around linux and correct build triggering. * Add updated projects. * Remove the run-generators-main.cpp as no longer needed for 'run-generators' tool. Removed the adding of files by default from baseSlangProject Made the run generators project use slang-string.cpp as the file it builds from core. * Add the run-generators VS project.	16 July 2020, 20:55:31 UTC
62079c5	Yong He	16 July 2020, 20:09:17 UTC	Support associatedtype local variables and return values in dynamic dispatch code (#1444) * Refactor lower-generics pass into separate subpasses. * IR pass to generate witness table wrappers. * Support associatedtype local variables and return values in dynamic dispatch code.	16 July 2020, 20:09:17 UTC
5758d16	Yong He	15 July 2020, 19:48:56 UTC	IR pass to generate witness table wrappers. (#1443) * Refactor lower-generics pass into separate subpasses. * IR pass to generate witness table wrappers. * Re-generate vs project files. * Fix x86 build error.	15 July 2020, 19:48:56 UTC
e9d5ecb	Yong He	15 July 2020, 18:39:11 UTC	Refactor lower-generics pass into separate subpasses. (#1442)	15 July 2020, 18:39:11 UTC
723c9b1	Tim Foley	15 July 2020, 16:31:27 UTC	Remove KernelContext wrapper from CPU/CUDA emit (#1440) * Remove KernelContext wrapper from CPU/CUDA emit Currently, the CPU and CUDA C++ targets rely on a `KernelContext` type that is generated during emit, as a way to provide implicit access to things that were global in the input Slang code, but that can't actually be emitted as globals in the target language (because the semantics of global declarations differ). For example, input like: ```hlsl ConstantBuffer<Stuff> gStuff; // shader parameter groupshared int gData[1024]; // thread-group shared variable static int gCounter = 0; // "thread-local" global-scope variable void subroutine() { ... } [shader("compute")] void computeMain() { ... } ``` would translate to output C++ for CPU a bit like: ```c++ struct KernelContext { ConstantBuffer<Stuff> gStuff; int gData[1024]; int gCounter = 0; void subroutine() { ... } void computeMain() { ... } }; ``` Note that both `computeMain()` and `subroutine()` are non-`static` member functions on `KernelContext`, so they have an implicit `this` parameter of type `KernelContext`, which allows the bodies of those functions to implicitly reference `gStuff`, etc. by name. Because `KernelContext::computeMain()` is a member function, we end up emitting an additional global-scope function to expose the entry point to the outside world, and that function is responsible for declaring a local `KernelContext` and invoking the generated entry point on it. This approach has several important drawbacks: * It complicates the emit logic for CPU and CUDA, with many special cases around when/how things get emitted * It complicates the implementation of dynamic dispatch, because what seems like a function pointer in Slang IR needs to be a pointer-to-member-function in C++. 
* It makes it difficult to have a non-kernel-oriented mode of compilation for CPU where a Slang function with a given signature gets output as a C++ CPU function with the "same" signature (not wrapped up as a member function of `KernelContext`). This change makes a step toward addressing these issues by making the introduction of the `KernelContext` type something that is done in an explicit IR pass instead of being handled as part of the last-mile emit logic. The most important change is the removal of code related to `KernelContext` from the `slang-emit-{cpp,cuda}.{h,cpp}` files, with the equivalent logic instead being handled in a new pass in `slang-ir-explicit-global-context.{h,cpp}`. It should be noted that further cleanups to the emit logic should now be possible; in particular, both the CPU and CUDA emit paths are manually sequencing the `EmitAction`s instead of relying on the default logic, but at this point they should be able to just use the default. The additional cleanups are left for future work. The explicit IR pass does more or less what one would expect: it identifies global-scope entities (global variables and parameters) that need to be wrapped and turns them into fields of a `KernelContext` type. It then modifies all entry points to initialize a `KernelContext` as part of their startup. Finally, any code that used to refer to the global entities is changed to refer to a field of the context, with the context passed via new function parameters (the new parameter is only added to functions that need it for now). Transforming global variables into fields of a `KernelContext` type in the IR pass ends up dropping their initial-value expressions (since those were attached as basic blocks on the `IRGlobalVar`). 
To avoid breaking code that relies on global-scope (but thread-local) variables, this change also adds an explicit pass that takes the initialization logic on all global variables and moves it to explicit logic that runs at the start of every entry point in a linked module (`slang-ir-explicit-global-init.{h,cpp}`). This pass would also be useful when we get back to direct SPIR-V emit, since SPIR-V also requires initialization logic for globals to be emitted into entry points. One complication that arises when the IR is introducing the types for entry-point parameters, global-scope parameters, and the `KernelContext` type is that it becomes harder for the emit logic to utter the names of those types (they might not even have names, since `IRNameHint`s might get stripped). This created a problem since the wrapper operations that were being generated for CPU were taking `void*` parameters and casting them to the appropriate type. To work around this issue, we have added an explicit IR pass (`slang-ir-entry-point-raw-ptr-params.{h,cpp}`) that transforms the signature of entry points so that any pointer parameters instead become raw pointer (`void*`) parameters, with the casting being handled inside the entry point itself. One consequence of all the above changes is that for the CUDA target we no longer need a wrapper function to invoke the generated entry point, because the IR function for the entry point ends up having the correct/expected signature already. This is also the case for CPU when it comes to the `_Thread` wrapper function, but this change doesn't try to eliminate the wrapper because of a belief that the `_Thread`-level interface is going away anyway. 
Because the IR is now responsible for ensuring the signature of the IR entry point for CUDA and CPU is what is expected, I needed to modify the `slang-ir-entry-point-uniforms` pass to always create an explicit parameter for the entry point uniforms when compiling for CUDA/CPU, even if there were no `uniform` parameters on the entry point as written. This also ended up requiring some tweaks to the parameter layout logic to ensure that CPU/CUDA targets always treat `ConstantBuffer<T>` as a `T` even in the case where `T` is an empty `struct` type (which happens when we construct a `struct` type to represent the uniform parameters of an entry point with no uniform parameters...). There are several future changes that can/should build on this work: * We should change the generated signatures for CUDA kernels, so that they don't rely on `KernelContext` for global-scope parameters. At that point we can avoid generating a `KernelContext` at all for CUDA, except when a program uses global-scope thread-local variables. * We should figure out how to make the "ABI" for dynamic-dispatch calls ensure that the kernel context is either always passed, or always not passed. Making a hard-and-fast rule as part of the calling convention for dynamic calls would ensure that access through the context continues to work with dynamic calls (this change might break it in some cases). * We should figure out how to handle the layout for the `KernelContext` in cases where a program is composed of multiple separately-compiled modules. Right now the layout of the `KernelContext` requires global knowledge (as does the pass that introduces explicit initialization for global-scope thread-locals). * We should try to further clean up the CPU/CUDA C++ emit logic to fall back on the default emit behavior more, now that the various special-case approaches that were taken are no longer needed * fixup: restore build files to default configuration	15 July 2020, 16:31:27 UTC
48f26ef	Yong He	13 July 2020, 22:16:09 UTC	Dynamic code gen for functions returning generic types. (#1439) * Dynamic code gen for functions returning generic types. * Add expected test result.	13 July 2020, 22:16:09 UTC
249f48d	Tim Foley	10 July 2020, 21:30:57 UTC	CUDA/CPU varying compute inputs as IR pass (#1438) The main change here is that the CPU and CUDA C++ emit paths now rely on an earlier IR pass to legalize the varying parameter list of a kernel and translate references to varying parameters with semantics like `SV_DispatchThreadID`. Doing so removes a lot of special-case logic from the emit passes. This work moves us even closer to being able to eliminate `KernelContext` from the CPU/CUDA emit logic, because it removes the issue of state related to varying inputs being stored in `KernelContext`. The new pass that handles the legalization is in `slang-ir-legalize-varying-params.cpp`, and it borrows heavily from the existing `slang-ir-glsl-legalize.cpp` pass. The new pass factors out the target-independent and target-dependent logic, so that both CPU and CUDA can share much of the same code despite having very different rules for how the system-value parameters are being provided. An eventual goal is to have the new pass also handle the GLSL case, but doing so requires copying even more logic out of the GLSL-specific pass, and doing so seemed like a step too far for what was meant to be a stepping-stone change as part of other work. As a result of the incomplete nature of the pass, certain cases don't work for compute shader inputs for CPU/CUDA (e.g., wrapping your varying inputs in a `struct` type parameter), but those were cases that also didn't work in the existing `emit`-based logic. One major consequence of this change is that the logic for emitting the various different functions that represent an entry point for our CPU back-end has been streamlined and simplified. 
The original logic had a fair bit of cleverness built in to try and avoid unnecessary math ops when computing the various IDs/indices, while the new logic is much more simplistic (the main dispatch function loops over threadgroups with a triply-nested `for` and then delegates to the group-level function loops over threads with its own nested `for`s). Longer term, it will be important to simplify the CPU functions we emit further, by eliminating things like the `_Thread` function that should never really be exposed to users (the minimum granularity of invoking a CPU compute kernel should be a single threadgroup). We may eventually decide to synthesize all of the extra code that is being generated in the `emit` pass as IR instead.	10 July 2020, 21:30:57 UTC
6aad38a	Tim Foley	10 July 2020, 18:14:11 UTC	Fix a preprocessor bug affecting X-macros (#1436) * Fix a preprocessor bug affecting X-macros Fixes #1435 This bug exhibited as nondeterministic output from the preprocessor in release builds, but using a debug build it was narrowed down to a use-after-free issue. The core problem is subtle, but relates to how we set up the linked list that represents the "busy" status of macros in a particular expansion environment. Consider this scenario: ```hlsl X(A) ``` The flow we expect from the preprocessor is something like: 1. Read the `X` token in `X(A)` and recognize the start of a function-like macro invocation. Create an expansion environment for `X`, with the global environment as a parent, read in the arguments (just `A`), and push that expansion onto the stack. 2. Read the `M` token that starts the expansion of `M`, and recognize it as an invocation of the object-like macro representing the argument `M`. Create an expansion environment for the definition of `M` (which is just `A`), and push it onto the stack. 3. Read the token `A` from the expansion for the argument `M`, and recognize it as an invocation of the function-like macro `A`. Create an expansion environment for `A`, with the current environment as its parent, read in the arguments (just `0`), and push that expansion onto the stack. 4. Read the token `y` from the expansion for `A`, and recognize it as an invocation of the object-like macro representing the argument `y`. Create an expansion environment for the definition of `y` (which is just `0`) and push it onto the stack. 5. Read `0`. 6. Read a bunch of end-of-file tokens that cause all of these expansions to be popped. 
That all looks fine as written, but the gotcha is that the input stream for the expansion in step (2) is only a single token (`A`), which means that during step (3) the current input stream at the time we create the macro expansion for `A` is at the end of its input, and by the time we've read in the macro arguments that expansion will have been popped. The problem, then, is that the logic for setting up the stack of "busy" macros was being performed at the beginning of the expansion (the part referred to as "create an expansion" above), when it should only have been set up as part of pushing the expansion onto the stack (since at that point we have a guarantee that the parent expansion cannot be popped until the child expansion has been). The fix here is thus pretty simple: we already have distinct operations for `initializeMacroExpansion()` and `pushMacroExpansion()`, and I simply moved the logic for setting up the "busy" state from the former to the latter. * fixup: typo	10 July 2020, 18:14:11 UTC
2503280	Yong He	10 July 2020, 16:13:50 UTC	Dynamic code gen for generic local variables. (#1434) * Dynamic code gen for generic local variables. * Fixes to function calls with generic typed `in` argument. * Fixes per code review comments	10 July 2020, 16:13:50 UTC
a5a67aa	Yong He	08 July 2020, 20:53:10 UTC	Checkin .clang-format and an example file for discussion (#1373) * Checkin .clang-format and an example file for discussion * Update clang-format settings per discussion comments * update .clang-format * Move .clangformat file to extras/ folder Co-authored-by: Yong He <yhe@nvidia.com> Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	08 July 2020, 20:53:10 UTC
9590948	Tim Foley	08 July 2020, 20:52:40 UTC	Add support for global uniform shader parameters (#1433) * Adding support for global uniform shader parameters This change adds support for Slang programmers to declare shader parameters of "ordinary" types at global scope: ```hlsl uniform float gScaleFactor; void main() { ... = gScaleFactor; ... } ``` The generated HLSL/GLSL/DXIL/SPIR-V output will be something along the lines of: ```hlsl struct GlobalParams { float gScaleFactor; } cbuffer globalParams { GlobalParams globalParams; } void main() { ... = globalParams.gScaleFactor; ... } ``` The binding information used for the implicit `globalParams` constant buffer will be determined by the existing implicit parameter binding logic (which already had support for this kind of transformation). The reason this change is being pursued right now is because it is one step toward removing the implicit `KernelContext` type that is used to wrap the generated code for our CPU and CUDA C++ targets. Handling global-scope parameters of ordinary type requires an IR pass that synthesizes the `GlobalParams` structure type above, and that step ends up removing the need for the similar `UniformState` structure that was being used in the CPU/CUDA emit logic. A more detailed guide to the changes included follows: * The diagnostic for a global-scope variable that is implicitly a shader parameter was kept, but changed to a warning. Users can opt out of the warning by decorating their parameter as a `uniform` (since that keyword is already being used to mark entry-point parameters that should be treated as uniform shader parameters). * To simplify the task of finding the global shader parameters, the `CLikeSourceEmitter` type has been given an `m_irModule` member. The previous emit logic for `UniformState` was having to do a roundabout solution involving the `EmitAction`s to deal with not having direct access to the module. 
* Removed a few dead declarations in the emit logic (related to a much earlier point where emit was based on the AST instead of the IR). * Made the computation of type names in C++ emit take into account `ConstantBuffer<T>` and `ParameterBlock<T>`. As far as I can tell, these were being handled with some special-case hacks in the emit logic instead of being supported more fundamentally. It might actually be good to pass these through as `ConstantBuffer<T>` and `ParameterBlock<T>` in the C++ output, and allow the prelude to customize their translation (defaulting to defining them as `T`). * Removed the special-case C++ emit logic for references to global shader parameters. There are now at most two global shader parameters to deal with, and the default emit logic (referring to them by name) does the Right Thing. * Changed the handling of entry points for C++ (both CPU and CUDA) so that it handles the bundled-up shader parameters for the global and entry-point scopes the same way. The main complication here is OptiX, where parameter data is passed very differently than it is for CUDA compute kernels. * Reverted changes to `ir-entry-point-uniforms` that had made its logic depend on the compilation target. The parameter binding logic was already responsible for deciding if a given target needed to wrap up its entry-point parameters in a constant buffer, and the IR pass was respecting that layout information. The current workaround had been removing the `ConstantBuffer<T>` indirection from this IR pass for CPU/CUDA, but then reintroducing the same indirection later on in the emit step. * Added an explicit IR pass with the task of collecting global-scope parameters of uniform/ordinary type and packaging them up into a `struct`, and then optionally packaging that `struct` up in a constant buffer. This pass bases its decisions on the IR layout information that was already computed, so it should match whatever policy choices were made at the layout level. 
* Changed the "key" operand on IR `struct` layout information to not assume an `IRStructKey`. The problem here is that the global scope gets a `StructTypeLayout` to represent its members, and this is convenient (rather than having to always special-case logic that handles the global scope), but the "fields" of that struct are global variables which do not have `IRStructKey`s associated with them. The simplest solution is to use the variables themselves as the keys, which required removing the assumption in the IR encoding. * Updated the IR layout process to compute a layout for the global scope of an entire program, and to attach that to the `IRModule` via a decoration. Updated the IR linking process to carry through that decoration to the linked output. This is necessary so that the IR pass that transforms global parameters can access the global-scope layout information. An important concern with this approach is that the contents and layout of the monolithic `GlobalParams` structure depend on the exact set of modules that were linked (and the order in which they were specified, in some cases). This isn't really a new thing with this change, but it becomes more important as we start to think of how to generalize things to better support separate compilation and linking. There are changes that can (and should) be made to the way that IR layouts are computed for programs (e.g., so that we compute layout per-module and then combine them rather than as a whole-program step). In this case, the problem of forming the combined/linked global layout can be moved down to the IR level and not be reliant on AST-level information. Just changing the way layout and linking interact would not change the fundamental problem that global shader parameters as they currently exist in Slang/HLSL/GLSL are not readily compatible with true separate compilation. 
We either need to find a solution strategy that we can apply to allow existing shaders to work with separate compilation or we need to incrementally work toward removing support for global-scope shader parameters in favor of explicit entry-point parameters in all cases. * fixup: missing files * fixup: comment the new code	08 July 2020, 20:52:40 UTC
cfb41bb	Dietrich Geisler	07 July 2020, 21:46:02 UTC	Public Keyword for Functions (#1432) This PR introduces support for the public modifier for functions. This keyword allows labelled functions to be written to the compiled output without having a link to an entry point. The goal of this change is to help support heterogeneous design of Slang by permitting C++ code to interact with CPU Slang functions. Internally, this PR adds the public decoration to the IR and defines a lowering from the public modifier in the AST to this decoration. Additionally, the Keep Alive decoration is added to any public modifier being lowered, which prevents DCE from eliminating functions labelled with the public keyword. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	07 July 2020, 21:46:02 UTC
f8cc28c	Yong He	07 July 2020, 20:25:47 UTC	Add a test case for dynamic dispatch with `This` type in interface decl. (#1431) * Add a test case for dynamic dispatch with `This` type in interface decl. * Update comments * fix typo in comments Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	07 July 2020, 20:25:47 UTC
1301f6b	Dietrich Geisler	07 July 2020, 15:57:56 UTC	Multiple Entry Point Cleanup (#1427) * Multiple Entry Point Cleanup This commit provides some in-code cleanup of the previous multiple entry point PR (#1411). Specifically, this PR provides refactoring of multiple entry point functions into helper functions, the removal of the EntryPointAndIndex struct, and various stylistic improvements. * Minor updates Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	07 July 2020, 15:57:56 UTC
cf62f13	Yong He	06 July 2020, 18:58:14 UTC	ShortList<T> and core.natvis improvements. (#1430) * ShortList<T> and core.natvis improvements. * Fix gcc build. * add `getBuffer()` accessor to `GetArrayViewResult`	06 July 2020, 18:58:14 UTC
ffd0b9c	Yong He	03 July 2020, 19:37:17 UTC	Emit pointers for CPU target. (#1418) Co-authored-by: Yong He <yhe@nvidia.com>	03 July 2020, 19:37:17 UTC
dfc9100	jsmall-nvidia	02 July 2020, 20:32:16 UTC	Bug fix in C++ extractor (#1429) * Fix bug from change in diagnostics. * Catch exceptions and display a message on problem with C++ extractor.	02 July 2020, 20:32:16 UTC
5fc0185	Tim Foley	02 July 2020, 18:45:59 UTC	Attempt to silence some warnings (#1428) * Attempt to silence some warnings This is an attempt to change code in `slang-ast-serialize.cpp` so that it doesn't trigger a warning(-as-error) in one of our build configurations. The original code is fine in terms of expressing its intent, so the right answer might actually be to silence the warning. * fixup: make sure to actually initialize	02 July 2020, 18:45:59 UTC
54675a3	jsmall-nvidia	02 July 2020, 13:09:52 UTC	Only call m_api functions if m_api has been validly set on dtor of VulkanDeviceQueue. (#1426)	02 July 2020, 13:09:52 UTC
6cbb88f	jsmall-nvidia	01 July 2020, 21:44:46 UTC	Disable dynamic dispatch tests on CUDA - as fails with exception about unhandled op. (#1425)	01 July 2020, 21:44:46 UTC
8c33e7b	jsmall-nvidia	01 July 2020, 20:11:56 UTC	Ignore tests that don't have all the rendering APIs they require available. (#1419)	01 July 2020, 20:11:56 UTC
5c15329	jsmall-nvidia	01 July 2020, 18:20:42 UTC	Fix bug in slang-dxc-support where it didn't get the source path correctly (#1420) * Fix handling of UniformState from #1396 * * Fix bug in slang-dxc-support where it didn't get the source path correctly * Make entryPointIndices const List<Int>&	01 July 2020, 18:20:42 UTC
69a0595	jsmall-nvidia	01 July 2020, 16:45:02 UTC	Fix handling of UniformState from #1396 (#1417)	01 July 2020, 16:45:02 UTC
8ced9d2	Tim Foley	30 June 2020, 19:25:27 UTC	Clean up unused code for IR object ownership (#1416) There was a small but non-trivial amount of code across `IRModule`, the `ObjectScopeManager`, and `StringRepresentationCache` that had to do with managing the lifetimes of `RefObject`s that might be referenced by IR instructions (and thus need to be kept alive for the lifetime of the IR module). We have long since migrated to a model where IR instructions do not include owned references to `RefObject`s, so these facilities weren't actually needed. This streamlines `IRModule`'s declaration, and trims code that we aren't actually using. One note for the future is that the `StringRepresentationCache` no longer does what its name implies (it is not a cache of `StringRepresentation`s), so we should consider giving it a more narrowly scoped name. I didn't include that in this change because I wanted to keep the diffs narrow and easy to review. A follow-on renaming change should be trivial if/when we can agree on what the type should be called at this point. Alternatively, we could simply bake the functionality of `StringRepresentationCache` into the IR deserialization logic itself, since that is the only code using it.	30 June 2020, 19:25:27 UTC
dc44b08	Tim Foley	30 June 2020, 17:01:09 UTC	Initial work on property declarations (#1410) * Initial work on property declarations Introduction ============ The main feature added here is support for `property` declarations, which provide a nicer experience for working with getter/setter pairs. If existing code had something like this: ```hlsl struct Sphere { float4 centerAndRadius; // xyz: center, w: radius float3 getCenter() { return centerAndRadius.xyz; } void setCenter(float3 newValue) { centerAndRadius.xyz = newValue; } // similarly for radius... } void someFunc(in out Sphere s) { float3 c = s.getCenter(); s.setCenter(c + offset); } ``` It can be expressed instead using a `property` declaration for `center`: ```hlsl struct Sphere { float4 centerAndRadius; // xyz: center, w: radius property center : float3 { get { return centerAndRadius.xyz; } set(newValue) { centerAndRadius.xyz = newValue; } } // similarly for radius... } void someFunc(in out Sphere s) { float3 c = s.center; s.center = c + offset; } ``` The benefits at the declaration site aren't that significant (e.g., in the example above we actually have slightly more lines of code), but the improvement in code clarity for users is significant. Having `property` declarations should also make it easier to migrate from a simple field to a property with more complex logic without having to first abstract the use-site code using a getter and setter. An important future benefit of `property` syntax will be if we allow `interface`s to include `property` requirements, and then also allow those requirements to be satisfied by ordinary fields in concrete types. Subscripts ---------- The Slang compiler already has limited (stdlib-use-only) support for `__subscript` declarations, which are conceptually similar to `operator[]` from the C++ world, but are expressed in a way that is more in line with `subscript` declarations in Swift. 
A `SubscriptDecl` in the AST contains zero or more `AccessorDecl`s, which correspond to the `get` and `set` clauses inside the original declaration (there is also a case for a `__ref` accessor, to handle the case where access needs to return a single address/reference that can be atomically mutated). A major goal of the implementation here is to re-use as much of the infrastructure as possible for `__subscript` declarations when implementing `property` declarations. Nonmutating Setters ------------------- One additional thing added in this change is the ability to mark a `set` accessor on either a subscript or a property as `[nonmutating]`, and indeed all of the existing `set` accessors declared in the stdlib have been marked this way. The need for this modifier is a bit subtle. If we think about a typical subscript or property: ```hlsl struct MyThing { int f; property p : int { get { return f; } set(newValue) { f = newValue; } } } ``` it is clear we want the `set` accessor to translate to output HLSL as something like: ``` void MyThing_p_set(inout MyThing this, int newValue) { this.f = newValue; } ``` Note how the implicit `this` parameter is `inout` even though we didn't mark anything as `[mutating]`. This is the obvious thing a user would expect us to generate given a property declaration. Now consider a case like the following: ```hlsl struct MyThing { RWStructuredBuffer<int> storage; property p : int { get { return storage[0]; } set(newValue) { storage[0] = newValue; } } } ``` This new declaration doesn't require (or want) an `inout` `this` parameter at all: ``` void MyThing_p_set(MyThing this, int newValue) { this.storage[0] = newValue; } ``` In fact, given the limitations in the current Slang compiler around functions that return resource types (or use them for `inout` parameters), we can only support a `set` operation like this if we can ensure that the `this` parameter is considered to be `in` instead of `inout`. 
This is exactly the behavior we allow users to opt into with a `[nonmutating] set` declaration. All of the subscript operations in the stdlib today have `set` accessors that don't actually change the value of `this` that they act on (e.g., storing into a `RWStructuredBuffer` using its `operator[]` doesn't change the value of the `RWStructuredBuffer` variable -- just its contents). We'd gotten away without this detail so far just because `set` accessors were only being declared in the stdlib and they were all implicitly `[nonmutating]` anyway, so it never surfaced as an issue that the code we generated assumed a setter wouldn't change `this`. Implementation ============== Parser and AST -------------- Adding a new AST node for `PropertyDecl` and the relevant parsing logic was mostly straightforward. The biggest change was allowing a `set` declaration to introduce an explicit name for the parameter that represents the new value to be set. This change also adds a `[nonmutating]` attribute as a dual to `[mutating]`, for reasons I will get to later. Semantic Checking ----------------- The `getTypeForDeclRef` logic was updated to allow references to `property` declarations. Some of the semantic checking work for subscripts was pulled out into re-usable subroutines to allow it to be shared by `__subscript` and `property` declarations. The checking of accessor declarations, which sets their result type based on the type of the outer `__subscript` was changed to also handle an outer `property`. Some special-case logic was added for checking of `set` declarations to make sure that their parameter is given the expected type. Some logic around deciding whether or not `this` is mutable had to be updated to correctly note that `this` should be mutable by default in a `set` accessor, with an explicit `[nonmutating]` modifier required to opt out of this default. (This is the inverse of how a typical method or `get` accessor works). 
IR Lowering ----------- The good news is that after IR lowering, access to properties turns into ordinary function calls (equivalent to what hand-written getters and setters would produce), so that subsequent compiler steps (including all the target-specific emit logic) don't have to care about the new feature. The bad news is that adding `property` declarations has revealed a few holes in how IR lowering was handling `__subscript` declarations and their accessors, so that it didn't trivially work for the new case as-is. The IR lowering pass already has the `LoweredValInfo` type that abstractly represents a value that resulted from lowering some AST code to the IR. One of the cases of `LoweredValInfo` was `BoundSubscript` that represented an expression of the form `baseVal[someIndex]` where the AST-level expression referenced a `__subscript` declaration. The key feature of `BoundSubscript` is that it avoided deciding whether to invoke the getter, the setter, or both "too early" and instead tried to only invoke the expected/required operations on-demand. This change generalizes `BoundSubscript` to handle `property` references as well, so it changes to `BoundStorage`. Making the type handle user-defined property declarations required fixing a bunch of issues: * When building up argument lists in the IR, we need to know whether an argument corresponds to an `in` or an `out`/`inout` parameter, to decide whether to pass the value directly or a pointer to the value. Some of the logic in the lowering pass had been playing fast and loose with this, so this change tries to make sure that whenever we are computing a list of `IRInst` that represent the arguments to a call we have the information about the corresponding parameter. 
Similarly, when emitting a call to an accessor in the IR, the information about the expected type of the callee was missing/unavailable, and the code was incorrectly building up the expected type of the callee based on the types of the arguments at the call site. The logic has been changed so that we can extract the expected signature of an accessor (how it will be translated to the IR) using the same logic that is used to produce the actual `IRFunc` for the accessor (so hopefully both will always agree). * Dealing with `in` vs. `inout` differences around parameters means also dealing with the "fixup" code that is used to assign from the temporary used to pass an `inout` argument back into the actual l-value expression that was used. That logic has all been hoisted out of the expression visitor(s) and into the global scope. Future Work =========== The entire approach to handling l-values in the IR lowering pass is broken, and it is in need of a complete rewrite based on new first-principles design goals. While something like `LoweredValInfo` is decent for abstracting over the easy cases of r-values, addresses, and a few complicated l-value cases like swizzling, it just doesn't scale to highly abstract l-values like we get from `__subscript` and `property` declarations, nor other corner cases of l-values that we need to handle (e.g., passing an `int` to an `inout float` parameter is allowed in HLSL, and performs conversions in both directions!). It Should be Easy (TM) to extend the logic that tries to synthesize an interface conformance witness method when there isn't an exact match to also support synthesizing a property declaration (plus its accessors) to witness a required property when the type has a field of the same name/type. * fixup: pedantic template parsing error (thanks, clang!) 
* fixup: cleanups and review feedback * Removed some `#ifdef`'d out code from merge change * Added proper diagnostics for accessor parameter constraints, which led to some fixes/refactorings * Added a test case for the accessor-related diagnostics	30 June 2020, 17:01:09 UTC
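The distinction between a default (mutating) `set` accessor and a `[nonmutating] set` accessor in #1410 can be illustrated with a rough C++ analogy. This is a sketch under stated assumptions, not compiler output: `BufferHandle` is a hypothetical stand-in for a resource type such as `RWStructuredBuffer<int>`, and pass-by-reference versus pass-by-value models `inout` versus `in` for the implicit `this` parameter.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a resource handle: copying the handle does not
// copy the storage it refers to.
struct BufferHandle {
    std::vector<int>* data;
};

// Case 1: the property is backed by a plain field. The setter has to modify
// the object itself, so `self` is taken by reference (the `inout` analogue).
struct FieldThing {
    int f;
};
void FieldThing_p_set(FieldThing& self, int newValue) { self.f = newValue; }

// Case 2: the property is backed by a buffer. The setter writes through the
// handle and never changes the handle itself, so `self` can be taken by
// value (the `in` analogue), which is what `[nonmutating] set` opts into.
struct BufferThing {
    BufferHandle storage;
};
void BufferThing_p_set(BufferThing self, int newValue) {
    (*self.storage.data)[0] = newValue;
}
int BufferThing_p_get(BufferThing self) { return (*self.storage.data)[0]; }
```

In the second case the write is still visible to the caller even though `self` is a copy, because only the contents of the buffer change, not the value of the handle.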
47b43f8	Dietrich Geisler	29 June 2020, 21:42:12 UTC	Backend for Multiple Entry Points (#1411) * Backend for Multiple Entry Points Introduces the basic backend on the compiler for zero or more entry points. Entry points have been extended to lists for several functions, and loopFunctions have been extended to take in entry points and indices as appropriate, to allow for multiple entry points once the frontend is expanded. Several functions are currently being assumed to have a single entry point for simplicity and provide a work in progress commit. * Progress on debugging fixes * Tests passing * Refactored emitEntryPoints * Updated lists to be by constant reference * Fixes to formatting * Refactoring updates for the compiler * Fix for compilation errors * Reformatting * More reformatting * Moved struct around to help with compilation Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	29 June 2020, 21:42:12 UTC
3e8bdb6	Yong He	26 June 2020, 18:59:33 UTC	Merge pull request #1408 from csyonghe/dyndispatch2 Dynamic dispatch for generic interface requirements and `associatedtype`	26 June 2020, 18:59:33 UTC
4e44398	Yong He	26 June 2020, 17:20:01 UTC	Merge remote-tracking branch 'official/master' into dyndispatch2	26 June 2020, 17:20:01 UTC
d084f63	jsmall-nvidia	26 June 2020, 16:40:31 UTC	AST serialize improvements (#1412) * Try to fix problem with C++ extractor concating tokens producing an erroneous result. * Improve naming/comments around C++ extractor fix. * Another small improvement around space concating when outputing token list. * Handle some more special cases for consecutive tokens for C++ extractor concat of tokens. * WIP AST serialization. * Comment out so compile works. * More work on AST serialization. * WIP AST serialize. * WIP AST Serialization - handling more types. * WIP: Compiles but not all types are converted, as not all List element types are handled. * Compiles with array types. * Finish off AST serialization of remaining types. * Remove ComputedLayoutModifier and TupleVarModifier. * Add fields to ASTSerialClass type. * Construct AST type layout. * AST Serialization working for writing to ASTSerialWriter. * Removed call to ASTSerialization::selfTest in session creation. * Fixes for gcc. * Diagnostics handling - better handling of dashify. * Improve comment around DiagnosticLookup. * Updated VS project. * Write out as a Stream, taking into account alignment. * First pass at serializing in AST. * Added support for deserializing arrays. * Small bug fixes. * Fix problem calculating layout. Split out loading on entries. * Fix typo in AST conversion. * Add some flags to control AST dumping. * Fix bug from a typo. * Special case handling of Name* in AST serialization. * Special case handling of Token lexemes, make Names on read. * Documentation on AST serialization. * ASTSerialTestUtil - put AST testing functions. Fix typo that broke compilation. * Fix typo.	26 June 2020, 16:40:31 UTC
4cf7119	Yong He	26 June 2020, 02:56:39 UTC	Add a TODO comment for generic interface requirement key	26 June 2020, 02:56:39 UTC
dd88ba1	Yong He	26 June 2020, 02:03:51 UTC	Fixes	26 June 2020, 02:03:51 UTC
5b57195	Yong He	26 June 2020, 01:26:12 UTC	Fixes.	26 June 2020, 01:26:12 UTC
09c64ac	Yong He	25 June 2020, 23:29:39 UTC	Merge remote-tracking branch 'official/master' into dyndispatch2	25 June 2020, 23:29:39 UTC
218a39b	Yong He	25 June 2020, 23:29:07 UTC	remove ThisPointerDecoration, generate IRInterfaceType in one pass	25 June 2020, 23:29:07 UTC
509e36b	Yong He	25 June 2020, 21:01:33 UTC	Remove interfaceType operand from lookup_witness_method inst	25 June 2020, 21:01:33 UTC
892acc4	jsmall-nvidia	25 June 2020, 20:41:14 UTC	AST Serialize Reading (#1409) * Try to fix problem with C++ extractor concating tokens producing an erroneous result. * Improve naming/comments around C++ extractor fix. * Another small improvement around space concating when outputing token list. * Handle some more special cases for consecutive tokens for C++ extractor concat of tokens. * WIP AST serialization. * Comment out so compile works. * More work on AST serialization. * WIP AST serialize. * WIP AST Serialization - handling more types. * WIP: Compiles but not all types are converted, as not all List element types are handled. * Compiles with array types. * Finish off AST serialization of remaining types. * Remove ComputedLayoutModifier and TupleVarModifier. * Add fields to ASTSerialClass type. * Construct AST type layout. * AST Serialization working for writing to ASTSerialWriter. * Removed call to ASTSerialization::selfTest in session creation. * Fixes for gcc. * Diagnostics handling - better handling of dashify. * Improve comment around DiagnosticLookup. * Updated VS project. * Write out as a Stream, taking into account alignment. * First pass at serializing in AST. * Added support for deserializing arrays. * Small bug fixes. * Fix problem calculating layout. Split out loading on entries. * Fix typo in AST conversion.	25 June 2020, 20:41:14 UTC
a1fed5e	Yong He	25 June 2020, 20:19:45 UTC	Partial fixes to code review comments	25 June 2020, 20:23:28 UTC
ffa9a35	Yong He	25 June 2020, 01:09:40 UTC	Fix `lowerFuncType` and small bug fixes.	25 June 2020, 03:25:49 UTC
161c525	Yong He	24 June 2020, 21:22:52 UTC	Fixes.	25 June 2020, 01:10:15 UTC
0ca75fe	Yong He	24 June 2020, 20:16:11 UTC	Dynamic dispatch for generic interface requirements. -Lower interfaces into actual `IRInterfaceType` insts. -Lower `DeclRef<AssocTypeDecl>` into `IRAssociatedType` -Generate proper IRType for generic functions. -Add a test case exercising dynamic dispatching a generic static function through an associated type. -Bug fixes for the test case.	25 June 2020, 01:10:15 UTC
3fe4f53	Dietrich Geisler	24 June 2020, 21:22:58 UTC	Heterogeneous example (#1399) * Introduced heterogeneous example. Example includes C++ source and header files, and does not currently make use of the associated slang file when building. The intent of this commit is to introduce the example as a baseline for later updates as the heterogeneous model is expanded. * Changing namespace * Renamed and rewrote README * Updated example to account for compiler updates * Updated path Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	24 June 2020, 21:22:58 UTC
ae41db8	jsmall-nvidia	24 June 2020, 17:56:06 UTC	AST Serialization writing (#1407) * Try to fix problem with C++ extractor concating tokens producing an erroneous result. * Improve naming/comments around C++ extractor fix. * Another small improvement around space concating when outputing token list. * Handle some more special cases for consecutive tokens for C++ extractor concat of tokens. * WIP AST serialization. * Comment out so compile works. * More work on AST serialization. * WIP AST serialize. * WIP AST Serialization - handling more types. * WIP: Compiles but not all types are converted, as not all List element types are handled. * Compiles with array types. * Finish off AST serialization of remaining types. * Remove ComputedLayoutModifier and TupleVarModifier. * Add fields to ASTSerialClass type. * Construct AST type layout. * AST Serialization working for writing to ASTSerialWriter. * Removed call to ASTSerialization::selfTest in session creation. * Fixes for gcc. * Diagnostics handling - better handling of dashify. * Improve comment around DiagnosticLookup. * Updated VS project.	24 June 2020, 17:56:06 UTC
b595dd0	Yong He	19 June 2020, 20:16:22 UTC	Merge pull request #1403 from tfoleyNV/struct-inheritance-and-interfaces Work on struct inheritance and interfaces	19 June 2020, 20:16:22 UTC
1988011	Tim Foley	19 June 2020, 18:50:15 UTC	fixup: review feedback	19 June 2020, 18:50:15 UTC
11e377a	Tim Foley	19 June 2020, 18:48:28 UTC	Merge remote-tracking branch 'origin/master' into struct-inheritance-and-interfaces	19 June 2020, 18:48:28 UTC
110d15b	Yong He	19 June 2020, 18:19:51 UTC	Dynamic dispatch for static member functions of associatedtypes. (#1404)	19 June 2020, 18:19:51 UTC
fc4342b	Tim Foley	19 June 2020, 16:25:44 UTC	fixup: actually make the test case test something	19 June 2020, 16:25:44 UTC
5fbb9ff	Yong He	19 June 2020, 06:15:39 UTC	Merge pull request #1401 from jsmall-nvidia/feature/prelude-fix Prelude fix/disable memaccess warning on gcc	19 June 2020, 06:15:39 UTC
0eddf45	Tim Foley	17 June 2020, 21:55:46 UTC	Work on struct inheritance and interfaces The main new feature that works here is that a derived `struct` type can satisfy one or more interface requirements using methods it inherited from a base `struct` type: ```hlsl interface ICounter { [mutating] void increment(); } struct CounterBase { int val; [mutating] void increment() { val++; } } struct ResetableCounter : CounterBase, ICounter { [mutating] void reset() { val = 0; } } ``` Here the derived `ResetableCounter` type is satisfying the `increment()` requirement from `ICounter` using the inherited `CounterBase` method instead of one defined on `ResetableCounter`. The crux of the problem here was that after lowering to HLSL/GLSL, the above code looks something like: ```hlsl struct CounterBase { int val; }; void CounterBase_increment(in out CounterBase this) { this.val++; } struct ResetableCounter { CounterBase base; } void ResetableCounter_reset(in out ResetableCounter this) { this.base.val = 0; } ``` The central problem is that `CounterBase_increment` here is not type-compatible with what we expect to find in the witness table for `ResetableCounter : ICounter`: the `this` parameter has the wrong type! The basic solution strategy here is to intercept the search for a witness to satisfy an interface requirement in `findWitnessForInterfaceRequirement` (those witnesses get collected into a witness table). The revised logic first looks for an exact match, which will only consider members introduced for the type itself, and not those introduced by base types. 
If an exact match for a method requirement is not found, the semantic checker then tries to synthesize a witness for the requirement, which more or less amounts to generating a function like: ```hlsl [mutating] void ResetableCounter::synthesized_increment() { this.increment(); } ``` The body of that synthesized method will type-check just fine in this case (because it desugars into `this.base.increment()`, more or less), and thus the synthesized method declaration can be used as the actual witness that drives downstream code generation. Details: * I added some options to lookup to allow us to explicitly skip member lookup through base interfaces; this should make sure that we don't accidentally satisfy an interface requirement using a member of the same or another interface (since such members are conceptually `abstract`). * As it originally stood, the semantic checker was allowing `CounterBase.increment()` to satisfy the `increment()` requirement of `ResetableCounter` directly, with the result that we got invalid HLSL/GLSL code as output. In order to avoid this and other bad cases, I made sure that the "exact match" case of requirement satisfaction ignores members that included any "breadcrumbs" in the lookup result item (since the breadcrumbs would all indicate transformations that needed to be applied to `this` to find the right member). * If we eventually have targets where `this` is passed by pointer/reference in all cases, then all of this work is not needed for the common case of single inheritance, and the base-type method should be usable as a witness directly. I don't see any easy way to handle that special case without producing target-dependent code in the front-end. It might be that we need an IR pass that can detect functions that are trivial "forwarding" functions and replace them with the function they forward to. 
* This change includes a test case that should have come along with the original PR that started adding struct inheritance Caveats: * The comments in this change talk about things like allowing a method with a default parameter to satisfy a requirement without that parameter. That scenario won't actually work at present because we still have an enormous hack in our logic for checking methods against requirements: we don't actually consider their signatures! I couldn't fold a fix for that issue into this change because there are subtle corner cases around associated types that we need to handle correctly (which were part of the reason why the checking is as hacked as it is) * This change does not try to test or address the case where we want to have a `Derived` type conform to `ISomething` because it inherits from `Base` and `Base : ISomething`. That case has its own details that need to be worked out, but ideally can follow a similar implementation strategy when it comes to re-using methods from `Base` to satisfy requirement on `Derived`.	18 June 2020, 23:00:40 UTC
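The witness-synthesis strategy from the struct-inheritance change can be modelled in plain C++ by writing out the lowered structs and a witness table by hand. This is only an analogy: `ICounterWitness`, `ResetableCounter_synthesized_increment`, and `incrementTwice` are hypothetical names chosen for this sketch, and the real compiler works on AST and IR nodes rather than function pointers.

```cpp
#include <cassert>

// Lowered form of the types from the commit message: inheritance becomes an
// embedded `base` field.
struct CounterBase {
    int val;
};
void CounterBase_increment(CounterBase& self) { self.val++; }

struct ResetableCounter {
    CounterBase base;
};
void ResetableCounter_reset(ResetableCounter& self) { self.base.val = 0; }

// Hypothetical witness table for `ResetableCounter : ICounter`. The slot
// expects `this` to be the conforming type, so `CounterBase_increment`
// cannot be stored here directly: its parameter type does not match.
struct ICounterWitness {
    void (*increment)(ResetableCounter& self);
};

// The synthesized witness: a trivial forwarding function that applies the
// "go to the base" step to `this` and then calls the inherited method.
void ResetableCounter_synthesized_increment(ResetableCounter& self) {
    CounterBase_increment(self.base);
}

const ICounterWitness kResetableCounterIsICounter = {
    &ResetableCounter_synthesized_increment,
};

// Code written against the interface only sees the witness table.
void incrementTwice(ResetableCounter& c, const ICounterWitness& w) {
    w.increment(c);
    w.increment(c);
}
```

The forwarding function is exactly the kind of trivial wrapper that the commit message suggests a later IR pass might detect and replace on targets where `this` is passed by pointer.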
515d8eb	Yong He	18 June 2020, 22:06:14 UTC	Merge branch 'master' into feature/prelude-fix	18 June 2020, 22:06:14 UTC
aa6aca4	Yong He	18 June 2020, 22:05:58 UTC	Merge pull request #1400 from csyonghe/dyndispatch Dynamic dispatch non-static functions.	18 June 2020, 22:05:58 UTC
82ba914	Tim Foley	18 June 2020, 20:40:08 UTC	Merge branch 'master' into dyndispatch	18 June 2020, 20:40:08 UTC
5952e3b	jsmall-nvidia	18 June 2020, 20:39:06 UTC	Prelude is associated with SourceLanguage (#1398) * Associate a downstream compiler for prelude lookup even if output is source. * Remove LanguageStyle and just use SourceLanguage instead. * Added set/getPrelude. Made prelude work on source language. * Fix typo in method name replacement. get/SetPrelude get/setLanguagePrelude * Fix issue because of method name change. * Remove getPreludeDownstreamCompilerForTarget	18 June 2020, 20:39:06 UTC
e2d2102	jsmall-nvidia	18 June 2020, 18:46:14 UTC	Try using cmath or math.h depending on compiler to avoid issues around isinf etc.	18 June 2020, 18:46:14 UTC
dfbe3cf	jsmall-nvidia	18 June 2020, 18:17:57 UTC	Fix and improvements around repro (#1397) * * Fix output in slang repro command line * Profile uses lowerCamel method names (had mix of upper and lower) * Rename slang-serialize-state/SerializeStateUtil to slang-repro and ReproUtil.	18 June 2020, 18:17:57 UTC
48da3ed	jsmall-nvidia	18 June 2020, 16:44:51 UTC	#include <cmath> Use SLANG_PRELUDE_STD macro to prefix functions that may need to be specified in std:: namespace.	18 June 2020, 16:44:51 UTC
5a86cd4	jsmall-nvidia	18 June 2020, 15:38:30 UTC	Improvements around C++ code generation (#1396) * * Remove UniformState and UniformEntryPointParams types * Put all output C++ source in an anonymous namespace * If SLANG_PRELUDE_NAMESPACE is set, make what it defines available in generated file. * Fix signature issue in performance-profile.slang * Context -> KernelContext to avoid ambiguity. * Fix issues around dynamic dispatch and anonymous namespace. * Fix typo.	18 June 2020, 15:38:30 UTC
f9b5f18	jsmall-nvidia	18 June 2020, 15:36:58 UTC	* Fix warnings from prelude * Make compilation work on gcc by disabling -Wclass-mem-access	18 June 2020, 15:36:58 UTC
31ae346	jsmall-nvidia	18 June 2020, 12:10:47 UTC	Associate a downstream compiler for prelude lookup even if output is source. (#1395) Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	18 June 2020, 12:10:47 UTC
8c6e02b	Yong He	18 June 2020, 05:41:56 UTC	Dynamic dispatch non-static functions.	18 June 2020, 06:47:11 UTC
d1a8cd2	Tim Foley	17 June 2020, 23:30:18 UTC	Add != operator for enum types (#1394) This was an oversight in the stdlib, and the `!=` definition follows the `==` in a straightforward fashion.	17 June 2020, 23:30:18 UTC
cd7f01b	Yong He	17 June 2020, 20:08:27 UTC	Generate dynamic C++ code for the minimal test case. (#1391) * Add IR pass to lower generics into ordinary functions. * Fix project files * Emit dynamic C++ code for simple generics and witness tables. Fixes #1386. * Remove -dump-ir flag. * Fixups.	17 June 2020, 20:08:27 UTC
ca503d4	jsmall-nvidia	17 June 2020, 16:15:29 UTC	Hotfix/slangc unreleased compile request (#1393) * Releases compile request if there is an error. * Arrange so that caller can clean up CompileRequest so don't have to capture all paths.	17 June 2020, 16:15:29 UTC
40370ac	Yong He	16 June 2020, 19:15:26 UTC	Merge pull request #1392 from tfoleyNV/premake-bat Add a batch file for invoking premake	16 June 2020, 19:15:26 UTC
aa925d3	Tim Foley	16 June 2020, 16:54:06 UTC	Add a batch file for invoking premake This change adds `./premake.bat` to the repository, which users on Windows (64-bit) can use to conveniently invoke the copy of premake that is pulled via the `slang-binaries` submodule. It should be possible to pass whatever options you passed to `premake5.exe` through to `premake.bat`. E.g., if you invoke: ``` .\premake.bat vs2015 ``` then you should get the desired results for the project/solution files we want to check in.	16 June 2020, 16:54:06 UTC
8ec293c	Yong He	16 June 2020, 01:10:02 UTC	Merge pull request #1390 from csyonghe/glsl-loop Emit [[dont_unroll]] GLSL attribute for [loop] attribute.	16 June 2020, 01:10:02 UTC
926e4bb	Tim Foley	15 June 2020, 20:56:11 UTC	Merge branch 'master' into glsl-loop	15 June 2020, 20:56:11 UTC
3461ed4	Yong He	15 June 2020, 20:55:56 UTC	Specialize function calls involving array arguments. (#1389) Fixes #890. Co-authored-by: Tim Foley <tfoleyNV@users.noreply.github.com>	15 June 2020, 20:55:56 UTC
d84cfb7	Tim Foley	15 June 2020, 19:05:04 UTC	Remove implicit conversions to `void` (#1388) * Remove implicit conversions to `void` Fixes #1372 The standard library code had accidentally introduced implicit-conversion `__init` operations on the `void` type that accepted each of the other basic types, so that a function written like: ```hlsl void bad() { return 1; } ``` would translate to: ```hlsl void bad() { return (void)1; } ``` The dual problems are that the input code should have produced a diagnostic of some kind, and the output code doesn't appear to compile correctly through fxc. This change introduces several fixes aimed at this issue: * First, the problem in the stdlib code is plugged: we don't introduce implicit conversion operations to or from `void` (we'd only been banning it in one direction before) * Next, an explicit `__init` was added to `void` that accepts any type so that existing HLSL code that might do `(void) someExpression` to ignore a result will continue to work. This is a compatibility feature, and it might be argued that we should at least warn when it is used. Note that this function is expected to never appear in output HLSL/GLSL because its result will never be used, and it is marked `[__readNone]` allowing calls to it to be eliminated as dead code. * During IR lowering, we now take care to only emit the `IRReturnVal` instruction type if there is a non-`void` value being returned, and use `IRReturnVoid` for both the case where no expression was used in the `return` statement and the case where an expression of type `void` is returned. * A test case was added to confirm that returning `1` from a `void` function isn't allowed, while returning `(void) 1` is. The net result of these changes is that we now produce an error for the bad input code, we allow explicit casts to `void` as a compatibility feature, and we are more robust about treating `void` as if it is an ordinary type in the front-end. 
* fixup: missing file	15 June 2020, 19:05:04 UTC
7e7425d	Yong He	15 June 2020, 16:05:49 UTC	Merge branch 'master' into glsl-loop	15 June 2020, 16:05:49 UTC
90444f8	Yong He	15 June 2020, 16:04:53 UTC	Generate IRType for interfaces, and reference them as `operand[0]` in IRWitnessTable values (#1387) * Generate IRType for interfaces, and use them as the type of IRWitnessTable values. This results in the following IR for the included test case: ``` [export("_S3tu010IInterface7Computep1pii")] let %1 : _ = key [export("_ST3tu010IInterface")] [nameHint("IInterface")] interface %IInterface : _(%1); [export("_S3tu04Impl7Computep1pii")] [nameHint("Impl.Compute")] func %Implx5FCompute : Func(Int, Int) { block %2( [nameHint("inVal")] param %inVal : Int): let %3 : Int = mul(%inVal, %inVal) return_val(%3) } [export("_SW3tu04Impl3tu010IInterface")] witness_table %4 : %IInterface { witness_table_entry(%1,%Implx5FCompute) } ``` * Fixes per code review comments. Moved interface type reference in IRWitnessTable from their type to operand[0]. * Fix typo in comment.	15 June 2020, 16:04:53 UTC
04a81ab	Yong He	13 June 2020, 07:24:21 UTC	Emit [[dont_unroll]] attribute in GLSL	13 June 2020, 07:25:12 UTC
36a06f1	Tim Foley	12 June 2020, 20:30:32 UTC	Diagnose circularly-defined constants (#1384) * Diagnose circularly-defined constants Work on #1374 This change diagnoses cases like the following: ```hlsl static const int kCircular = kCircular; static const int kInfinite = kInfinite + 1; static const int kHere = kThere; static const int kThere = kHere; ``` By diagnosing these as errors in the front-end we protect against infinite recursion leading to stack overflow crashes. The basic approach is to have front-end constant folding track variables that are in use when folding a sub-expression, and then diagnosing an error if the same variable is encountered again while it is in use. In order to make sure the error occurs whether or not the constant is referenced, we invoke constant folding on all `static const` integer variables. Limitations: * This only works for integers, since that is all front-end constant folding applies to. A future change can/should catch circularity in constants at the IR level (and handle more types). * This only works for constants. Circular references in the definition of a global variable are harder to diagnose, but at least shouldn't result in compiler crashes. * This doesn't work across modules, or through generic specialization: anything that requires global knowledge won't be checked * fixup: missing files * fixup: review feedback	12 June 2020, 20:30:32 UTC
2359921	Yong He	12 June 2020, 00:13:27 UTC	Merge pull request #1383 from csyonghe/dyndispatch Add compiler flag to disable specialization pass.	12 June 2020, 00:13:27 UTC
8452129	Yong He	11 June 2020, 18:10:40 UTC	Merge branch 'master' into dyndispatch	11 June 2020, 18:10:40 UTC
1c77c44	jsmall-nvidia	11 June 2020, 18:06:27 UTC	Fix problem with C++ extractor erroneous concatenation of type tokens (#1382) * Try to fix problem with C++ extractor concatenating tokens producing an erroneous result. * Improve naming/comments around C++ extractor fix. * Another small improvement around space concatenation when outputting token list. * Handle some more special cases for consecutive tokens for C++ extractor concat of tokens.	11 June 2020, 18:06:27 UTC
8de0a2e	Yong He	10 June 2020, 21:57:30 UTC	Add compiler flag to disable specialization pass.	10 June 2020, 21:57:30 UTC
98459ba	Yong He	09 June 2020, 17:35:26 UTC	Merge pull request #1381 from csyonghe/master Generate .tar.gz file in linux release	09 June 2020, 17:35:26 UTC
00e0e25	Yong He	08 June 2020, 23:16:12 UTC	Generate .tar.gz file in linux release	08 June 2020, 23:17:34 UTC
78696a6	jsmall-nvidia	08 June 2020, 19:28:48 UTC	Small fixes/improvements based on review. (#1379)	08 June 2020, 19:28:48 UTC
b3fbb92	Yong He	08 June 2020, 16:28:03 UTC	Merge pull request #1378 from csyonghe/fix Filter lookup results from interfaces in `visitMemberExpr`.	08 June 2020, 16:28:03 UTC
956ede9	Yong He	06 June 2020, 02:38:43 UTC	Filter lookup results from interfaces in `visitMemberExpr`. Fixes #1377	06 June 2020, 02:43:30 UTC
7d4432b	Yong He	06 June 2020, 02:34:55 UTC	Merge pull request #1375 from csyonghe/findtypebynamefix Fix FindTypeByName reflection API not finding stdlib types.	06 June 2020, 02:34:55 UTC
52026c7	Yong He	06 June 2020, 01:34:24 UTC	Merge branch 'master' into findtypebynamefix	06 June 2020, 01:34:24 UTC
43c1467	jsmall-nvidia	05 June 2020, 22:20:09 UTC	ASTNodes use MemoryArena (#1376) * Add an ASTBuilder to a Module. Only construct on valid ASTBuilder (was being called on nullptr on occasion) * Add nodes to ASTBuilder. * Compiles with RefPtr removed from AST node types. * Initialize all AST node pointer variables in headers to nullptr. * Initialize AST node variables as nullptr. Make ASTBuilder keep a ref on node types. Make SyntaxParseCallback return a NodeBase * Don't release canonicalType on dtor (managed by ASTBuilder). * Give ASTBuilders a name and id, to help in debugging. For now destroy the session TypeCache, to stop it holding things released when the compile request destroys ASTBuilders. * Moved the TypeCheckingCache over to Linkage from Session. * NodeBase no longer derived from RefObject. * Only add/dtor nodes that need destruction. First pass compile on linux.	05 June 2020, 22:20:09 UTC
92fc3aa	Yong He	05 June 2020, 20:01:06 UTC	Merge branch 'master' into findtypebynamefix	05 June 2020, 20:01:06 UTC
