https://github.com/shader-slang/slang

ab64cf2 Fix output for matrix multiply in GLSL code (#228) When Slang sees a matrix multiplication `M * v` in GLSL code it should (obviously) output GLSL code that also does `M * v`, but there was a bug introduced where the type-checker manages to resolve the operation and recognize it as a matrix-vector multiply, and then the code-generation logic says "oh, I'm generating output for GLSL, and that is reversed from HLSL/Slang, so I'd better reverse these operands!" and outputs `v * M`... which isn't what we want. I've fixed the problem in an expedient way, by having the front-end resolve the operation to what it believes is an intrinsic multiply operation, rather than a matrix-vector operation. If we ever support cross compilation *from* GLSL (unlikely), we'd need to fix this up so that we have both real matrix-vector multiplies and "reversed" multiplies (where the operands follow the GLSL convention). I've added two tests here to confirm the fix. The one under `tests/bugs` catches the actual issue described above, and confirms the fix. The other one under `tests/cross-compile` is just to make sure that we *do* properly reverse the operands to a matrix-vector product when converting from Slang to GLSL. 23 October 2017, 17:37:07 UTC
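As a rough sketch of the convention difference involved here (the function and variable names are invented, not taken from the commit or its tests):

```hlsl
// Hypothetical Slang/HLSL function.
float4 transformPoint(float4x4 M, float4 v)
{
    // Slang/HLSL convention: row vector on the left.
    return mul(v, M);
}
// Cross-compiling the call above to GLSL should emit the operands reversed,
// i.e. `M * v`. The bug was that GLSL input already written as `M * v` got
// the same reversal applied to it, so the output became `v * M`.
```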
624d122 Fix up emission of shader parameter semantics when using IR (#226) * Fix up emission of shader parameter semantics when using IR - Make sure to propagate entry point parameter layouts down to IR parameters when doing the initial cloning to form target-specific IR - When layout information is present on an IR node, prefer to use that over the original high-level declaration for outputting semantics in final HLSL - Fix up test runner to generate `.actual` files when running compute tests, in cases where the `render-test` application errors out (e.g., because of a Slang compilation error) - Add a first test of generics functionality, to show that they generate valid code through the IR - Right now this test is *not* using any "interesting" operations on the type parameter, so this is not a test that can confirm that interface constraints work * fixup: skip compute tests when running on Linux 20 October 2017, 22:16:10 UTC
7ba937f Merge pull request #225 from csyonghe/master Support running compute shaders in testing framework 19 October 2017, 23:38:07 UTC
5a18dc7 typo fix 19 October 2017, 22:20:29 UTC
8ff7412 Support running and comparing execution results of compute shaders in testing framework. 19 October 2017, 22:18:21 UTC
88023ae Reflection: allow querying of semantics on varying input/output (#224) This is functionality required to support a Falcor bug fix. Most of the code to compute the right semantic name/index for a parameter was already present. This change adds: - Storage for semantic name/index on every `VarLayout` - Note: this is wasteful and should be optimized later - A public API to query the semantic name/index - The contract is that this API returns `NULL` if the parameter had no semantic - A bit of work in `parameter-binding.cpp` to attach semantics to varying input/output when traversing varying parameters. - Note: this is intentionally set up so that it associates semantics even with non-leaf parameters, so that an API user can query the semantic of a `struct` parameter and know that its members will be assigned sequential semantic indices from its starting value. - Support for dumping this information in reflection tests One notable thing that I did *not* change here is that the reflection test fixture doesn't report information on the output of an entry point, even though it really should. That should be fixed in a separate change, though, because it would affect many of the expected outputs. 19 October 2017, 18:49:16 UTC
5995b7a Initial work on a pass to scalarize GLSL varying input/output (#223) There was already a pass in place that transformed parameters and results of an entry-point function into global variables for GLSL, but this pass would just turn a `struct`-type parameter into a `struct`-type global, which has two problems: - The standard GLSL language doesn't seem to allow `struct` types as vertex shader inputs or fragment shader outputs. - If there are any members in such a `struct` that represent "system value" inputs or outputs, then these would need to be transformed into the equivalent `gl_*` variables. This change adds a more complete scalarization process that applies to inputs/outputs during the legalization pass. In order to support this there is a little bit of a data structure for abstracting over tuples of values (this same idiom is used in a few other places, so perhaps the implementation could be done once and shared?). System values are currently handled in a painfully ad hoc (and incomplete) fashion during code emit. We need to come up with a better solution for mapping HLSL `SV_*` semantics over to `gl_*` variables. In some cases this mapping might introduce more code than we can easily deal with during emit time, so it probably needs to be handled back at the IR level. This implementation has many gaps, but it appears to be enough to get the `render/cross-compile-entry-point` test working with IR-based cross-compilation. 19 October 2017, 16:04:41 UTC
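A rough sketch of the kind of entry point this pass scalarizes (names are invented; the GLSL mapping in the comments is only an approximation):

```hlsl
// Hypothetical fragment entry point with a struct-typed input.
struct PSInput
{
    float4 position : SV_Position;
    float2 uv       : TEXCOORD0;
    float3 color    : COLOR0;
};

float4 psMain(PSInput input) : SV_Target
{
    return float4(input.color, input.uv.x);
}
// GLSL does not accept `PSInput` as a fragment input, so the pass busts the
// struct up into separate global declarations, roughly:
//     in vec2 uv;  in vec3 color;      // user-defined fields
//     gl_FragCoord                     // the SV_Position field
//     out vec4 fragColor;              // the SV_Target result
```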
a12480f Work on IR-based cross-compilation (#222) There are two big changes here: - Add logic during the initial IR cloning pass for an entry point + target that tries to pick the best possible version of any target-overloaded function. This allows us to pick the intrinsic version of `saturate()` when compiling for HLSL output, but then pick the non-intrinsic version (that is implemented in terms of `clamp()`) when targeting GLSL. - Add an initial specialization pass that tries to deal with generics. This required some fixing work to IR generation, so that we correctly generate explicit operations to specialize a generic for specific types (this is currently implemented as a `specialize` instruction that takes the generic to specialize plus a declaration-reference that represents the specialized form). With that work in place, we can scan for `specialize` instructions inside of non-generic functions, and use them to trigger generation of specialized code. We rely on the name-mangling scheme to help us find pre-existing specializations when possible. There are also a bunch of cleanups encountered along the way: - Don't use the explicit `layout(offset=...)` for uniforms, because it isn't supported by all current drivers. For now we will just assume that our layout rules compute the same values that the driver would for un-marked-up code. We can come back later and try to implement a workaround in the cases where this doesn't apply (e.g., by re-running the layout logic as part of emission, and dropping layout modifiers from variables that don't need explicit layout). - Fix some issues in IR dump printing so that we print function declarations more nicely. - Testing: print out failing pixel when image-diff fails 18 October 2017, 18:08:47 UTC
f12c255 Implement notion of a "container format" (#213) The big addition here is that the Slang "bytecode" is no longer treated as just a "code generation target" (`CodeGenTarget`) akin to DX bytecode (DXBC) or SPIR-V, but instead is a `ContainerFormat` that can be used to emit all the results of a compile request (well, currently just the IR-as-BC, but the intention is there). Getting to this goal involved some prior checkins that eliminated bogus "targets" that weren't really akin to SPIR-V or DXBC: `-target slang-ir-asm` and `-target reflection-json`. Those targets were really in place to support testing, and so they've been made more explicit testing/debug options. This change eliminates `-target slang-ir` and instead tries to allow the user to specify `-o foo.slang-module` as an output file name, that indicates the intention to output a "container" file that will wrap up all the generated code. I've also gone ahead and generalized the existing `-target` option so that we are actually building up a *list* of code generation targets. This is largely just a cleanup, since it forces code to be more aware of when it is doing something target-specific vs. target independent. For example, reflection layout information lives on a requested target, and not on the compile request as a whole, and similarly output code is per-target, per-entry-point. As a cleanup, I eliminated support for per-translation-unit output. This was vestigial code from back when I used to try and do HLSL generation for a whole translation unit instead of per-entry-point (which turned out to be a lot of complexity for little gain), and it was only being used in the `hello` example and the `render-test` test fixture - in both cases fixing it up was easy enough. I've stubbed out the old `spGetTranslationUnitSource` API, but haven't removed it yet. 16 October 2017, 20:12:11 UTC
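Under the new scheme, a hypothetical invocation combining a list of code-generation targets with a container output (file names invented; whether these particular options can be combined in one run is an assumption) might look something like:

```
slangc shader.slang -target hlsl -target glsl -o shader.slang-module
```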
3e3e247 Get rid of the `-slang-ir-asm` target (#212) * Get rid of the `-slang-ir-asm` target This is really only useful for debugging, so I've replaced the functionality with a `-dump-ir` command line option (which dumps the IR for an entry point before doing codegen). * fixup: use HLSL target, not DXBC, so test can run on Linux 14 October 2017, 05:39:15 UTC
64ddefb Move reflection JSON generation into separate test fixture (#211) Move reflection JSON generation into separate test fixture 14 October 2017, 01:14:42 UTC
575230b Work towards target-specific function overloads (#210) * Checkpoint: interface conformance work - Add explicit definition of `saturate` for the GLSL target, which calls through to `clamp` - Needed to add explicit initializer to `__BuiltinFloatingPointType` to allow initialization from a single `float`, so that the `saturate` implementation can be sure that it can initialize a `T` from `0.0` or `1.0`. - This triggered errors in overload resolution, because the logic in place could not figure out that the `T` of the outer generic (`saturate<T>()`) conformed to the interface required by the callee. At this point I have the call to the scalar `clamp()` getting past type-checking, but not the vector or matrix cases. * More fixups for overload resolution inside generics - Make sure value parameters are treated the same as type parameters: we only want to solve for the parameters of the generic actually being applied, and not accidentally generate constraints for outer generics (e.g., when checking the body of a generic function). - Make sure that the diagnostics stuff uses the correct source manager when expanding the location of a builtin. * Fixes for function redeclaration - Handle case of redeclaring a generic function - Enumerate siblings in the parent of the *generic* not the parent of the *function* - Add logic to compare generic signatures - When generic signatures match, specialize functions to compatible generic arguments before comparing the function signatures - Fix redeclaration logic to *not* detect prefix/postfix operators as redeclarations of one another - Build an explicit representation of function redeclaration groups - First declaration is the "primary" and others are stored in a linked list - Make overload resolution handle redeclared functions - Only consider the primary declaration and skip others 12 October 2017, 20:53:42 UTC
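A rough sketch of the shape of the GLSL-side definition being described (an approximation for illustration, not the verbatim stdlib code):

```hlsl
// Approximate shape of a target-specific saturate defined via clamp.
// It relies on __BuiltinFloatingPointType being initializable from a
// float literal, which is exactly what the commit adds.
__generic<T : __BuiltinFloatingPointType>
T saturate(T x)
{
    return clamp(x, T(0.0), T(1.0));
}
```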
9a231a5 Do loop fix (#209) * Bug fix: emit logic for `do` loops This case was never tested, and I was outputting some garbage characters. This commit fixes the codegen and adds a test case. * Bug fix: make sure to pass through `[allow_uav_condition]` This also fixes the standard library definition of `IncrementCounter()` so that it returns a `uint` instead of `void`. 12 October 2017, 17:30:48 UTC
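A minimal example of the constructs involved (hypothetical shader code, assuming a loop of roughly this shape is what the new test exercises):

```hlsl
RWStructuredBuffer<uint> gWork;

void fillSlots()
{
    uint slot;
    [allow_uav_condition]                 // attribute must survive into the emitted HLSL
    do
    {
        slot = gWork.IncrementCounter();  // returns uint (was mis-declared as void)
        gWork[slot] = slot;
    } while (slot < 64);
}
```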
28ca4dc Fixup: re-enable exception guard (#208) This is intended to produce a better experience for end users when they encounter internal compiler errors. 11 October 2017, 23:05:09 UTC
331ddfa Bug fixing (#207) * Bug fix for vector initializer lists When a vector was initialized with an initializer list: float4 f = { 0, 1, 2, 3 }; we were following the logic for `struct` types (since `vector<T,N>` is technically a `struct` declaration in our stdlib...), but the type has no fields, so we were (silently!) ignoring the actual operands. I've applied a simple fix where we cast the operands to the element type of the vector, but a more complete fix will be needed sooner or later where we check the operand counts properly, etc. * Create implicit cast AST nodes when calling initializers The logic for dealing with implicit conversions was recently beefed up so that it would look at `__init` declarations in the target type, but in those cases the front-end would always create an `InvokeExpr` even when we would rather get an `ImplicitCastExpr` or (in the "rewrite" case) a `HiddenImplicitCastExpr`. I've fixed this up for now by constructing a dummy expression to stand in for the "original" call expression when creating the final call (luckily our `TypeCastExpr` is already just a specialized `InvokeExpr`). A better long-term solution might be to have implicit-ness or hidden-ness be modifiers or flags, rather than needing to use specialized forms of call nodes. * Fix subscript operator for `RWTexture1D` The index type was being declared as `uint1` instead of `uint`, and that created problems for downstream HLSL compilation when we introduced expressions like `uav[uint1(index)]` - the compiler would complain that a vector is not a valid index type. * Fix up constant-folding of integer casts. The old logic was checking for `InvokeExpr` before `TypeCastExpr`, but in the new setup a type cast *is* an `InvokeExpr`, so that case was never triggering. All of the constant-folding code really needs to be revisited, though, so that it can use a more general-purpose evaluation scheme like the bytecode (so that we can handle a moral equivalent of `constexpr` in the long run). * Fix implicit conversion costs for vector types A recent change made it so that the logic for looking up implicit conversions now uses declarations of initializers in the standard library (rather than hand-coding all the cases in `check.cpp`). One mistake made there was that we dropped the logic for computing implicit conversions between vectors of the same size, but different element types. These conversions were still allowed by a catch-all (generic) declaration in the standard library, but that declaration didn't include any implicit conversion cost logic (since it was generic, there was no single cost to use). This change explicitly enumerates the required conversions with their costs. It is a bit unfortunate that this is an O(N^2) amount of code for N base types, but that seems unavoidable for now. * Handle "lowering" of overloaded expressions If we are in the `-no-checking` mode and the user calls an overloaded function from an `__import`ed file in a way such that Slang can't resolve the intended overload, we were failing to emit the definitions of the potential callees. This change simply adds a case for `OverloadedExpr` in `lower.cpp` that explicitly lowers all the declarations that might have been referenced. - There is a potential for breakage here if we are outputting GLSL and one of the overloads is stage-specific. - A more refined approach might try to recognize which of the overloaded options are even potentially applicable, and then output only those, but doing this would be way more complicated. I've added a test case for this behavior, but it is a bit brittle because we need to confirm that we still produce the same error message as unmodified HLSL. 11 October 2017, 21:39:10 UTC
892b339 Travis: try to fix deployment script (#206) 09 October 2017, 22:05:03 UTC
676560b Fix emit logic for `cbuffer` member with initializer (#205) Given an input declaration like: cbuffer C { int a = -1; } Slang was automatically generating a `packoffset` semantic to place the member manually, but was emitting it *after* the initializer of the original declaration: cbuffer C : register(b) { int a = -1 : packoffset(c0); } That syntax is invalid, of course, and we actually want: cbuffer C : register(b) { int a : packoffset(c0) = -1; } This wasn't spotted earlier because putting initializers on a `cbuffer` member is not commonly done, since it requires reading those values via the reflection API. Slang's reflection API currently provides no way to access default values like this, so they aren't of much use yet. Still, it is better to emit correct syntax even in cases like this one. 09 October 2017, 21:44:40 UTC
06b2192 Preprocessor: fix `undef` and redefinition (#204) * Preprocessor: fix `undef` and redefinition The logic for `undef` directives was failing to suppress macro expansion when reading the name to un-define, and so it wasn't actually working at all. We didn't notice this because we didn't have a test case, and users hadn't tried it. The logic for `define` had a similar bug, which meant that any attempt to define an already-defined macro would fail with a cryptic error, rather than raising the intended warning. Test cases have been added for both issues, along with the fixes. * fixup: add expected output for tests added 09 October 2017, 20:38:33 UTC
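A minimal input exercising both fixed paths (a hypothetical reduction, not the actual test files):

```hlsl
#define FOO 1
#define FOO 2      // redefining an existing macro should raise the intended warning
#undef FOO         // should actually remove the definition
#ifdef FOO
#error FOO should have been removed by the #undef above
#endif
```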
fe5eef4 Parser: fix precedence of cast expressions (#203) We were accidentally parsing this: (uint) a / b as this: (uint) ( a / b ) when it should be: ((uint) a) / b This is a bug that seems to have been inherited from a long time ago. It has taken a while to bite anybody because the only class of expressions it would hit are multiplicative ones, and in many cases the difference in the cast order won't be noticed for values in a limited range. 09 October 2017, 17:16:12 UTC
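A worked example of the difference (hypothetical values):

```hlsl
float castDemo()
{
    float a = 7.0;
    float b = 2.0;
    float oldParse = (uint) (a / b);  // old (wrong) grouping: (uint)3.5 == 3, giving 3.0
    float newParse = ((uint) a) / b;  // correct grouping: 7u / 2.0 == 3.5
    return newParse - oldParse;       // 0.5, so the two parses are observably different
}
```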
faf26af Perform some transformations on IR to legalize for GLSL (#200) When outputting GLSL from a Slang or HLSL entry point, we need to translate any parameters or results of an entry-point function into global declarations of `in` or `out` parameters, as needed by GLSL. This change adds these transformations at the IR level, so that they don't need to complicate the emit logic. More detailed changes: - Make `render0` test use IR. It passes out of the box. - Fix test runner to not always dump diffs on failures I accidentally initialized an option to `true` instead of `false` when working on debugging the Travis CI failures. - Special-case output for component-wise multiplication to handle GLSL `matrixCompMult()` - Handle GLSL vs. HLSL output for calls to `mul()` - Output proper `layout(std140)` on GLSL constant buffer declarations - Require appropriate GLSL extension when emitting explicit `layout(offset = ...)` on constant buffer members - TODO: Need to avoid requiring this extension in cases where the offsets are what would be computed anyway. Realistically, should probably be emitting code with explicit padding, etc. to guarantee layouts. - Add an IR-based pass to translate entry point functions by eliminating their input/output parameters and replacing them with global variables. - Demangle names when calling target intrinsics The lowering to the IR will turn a call like `sin(foo)` into a call to a function declaration with a mangled name like `_S3sin...`. This works fine when the user is calling their own functions, since the name mangling will apply to both the definition and use sites, but for builtin functions it obviously isn't what we want. This change makes it so that we demangle the name of an intrinsic function just enough so that we can extract the original simple name, and make a call using that. These changes do not provide 100% of what we need when translating to GLSL, so the `cross-compile-entry-point` test *still* hasn't been flipped over to use the IR (even though that is the test case I've been using to develop these changes). 06 October 2017, 21:31:42 UTC
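A small illustration of the two distinct products the emitter now has to tell apart (hypothetical function; the GLSL spellings are noted in comments):

```hlsl
float3x3 bothProducts(float3x3 M, float3x3 N)
{
    float3x3 compWise = M * N;      // HLSL `*` is component-wise; GLSL needs matrixCompMult()
    float3x3 matMul   = mul(M, N);  // HLSL mul() is the matrix product; GLSL uses its native `*`
                                    // (with operands ordered to match the target's convention)
    return compWise + matMul;
}
```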
3693bff Attempt to fix subprocess handling for Linux (#197) * Attempt to fix subprocess handling for Linux Our CI builds are sometimes hanging on Travis, and I suspect it might be something to do with how we are waiting for subprocesses to complete. I'm trying to follow the manpage for the `wait()` and `waitpid()` calls a bit better here. * fixup: try to use poll() instead of select() * fixup: missing header * fixup * fixup * fixup: try to emit test output when tests fail on Travis 06 October 2017, 15:43:47 UTC
33de715 Working on better handling of builtin functions in IR (#196) The main change I was working on here was to start having more of the builtin functions (in this case, `cos`, `sin`, and `saturate`) just lower to the IR as calls to builtin functions (with declarations but no definition), rather than expect/require them to map to individual IR opcodes in every case. The main change there was the removal of some `intrinsic_op` modifiers in the stdlib. This then requires the `isTargetInstrinsic` logic in IR-based code emit to avoid emitting declarations for these intrinsics. The corresponding logic for emitting *calls* to these intrinsics is currently being skipped. Along the way, a variety of fixups were added: - In order to support lowering to GLSL, we need to handle cases where a variable/function name uses a GLSL reserved word. The right long-term fix there is to always use generated or mangled names, but for now I'm hacking it by adding a `_s` prefix to all names during IR-based emit. - This needs a flag to disable it, since some of our tests currently rely on checking binding information from generated HLSL/SPIR-V that will include these mangled/modified names. - Emit matrix layout modifiers appropriately for GLSL - Specialize IR parameter-block emission between GLSL and HLSL - Fix up argument count/index logic for a couple of opcodes that weren't fixed when removing the types from the explicit operand list - Fix up IR generation for calls to declarations with generic arguments. We were briefly adding the generic args to the ordinary argument list, which added complexity in several places. We now rely on the declaration-reference nodes in the IR to carry that extra info. - TODO: We actually need to make sure that this is the case, since we don't currently correctly generate specialized decl-refs when building IR for function calls The main test that would have been affected by this is `cross-compile-entry-point`, but I was not able to get that working fully with the IR. The main problem in this case was that when emitting GLSL we will need to perform certain required transformations on the IR to get legal code for GLSL. Notably: - We need to hoist entry-point parameters away from being function parameters, and make them be global variables. This is currently being hand-waved during the emit logic, but it seems way better to have it all get cleaned up in the IR first. - We need to scalarize entry-point parameters, because structure input/output is not supported as vertex input or fragment output (and it may be best to always scalarize anyway, to match HLSL semantics). (Note: "scalarize" here means to bust up structures, but not matrices/vectors) 05 October 2017, 19:24:30 UTC
54f016e IR: overhaul IR design/implementation (#195) * IR: overhaul IR design/implementation Closes #192 Closes #188 This is a major overhaul of how the IR is implemented, with the primary goal of just using the AST-level type representation as the IR's type representation, rather than inventing an entire shadow set of types (as captured in issue #192). One consequence of this choice is that types in the IR are no longer explicit "instructions" and are not represented as ordinary operands (so a bunch of `+ 1` cases end up going away when enumerating ordinary operands). Along the way I also got rid of the embedded IDs in the IR (issue #188) because this wasn't too hard to deal with at the same time. Another related change was to split the `IRValue` and `IRInst` cases, so that there are values that are not also instructions. Non-instruction values are now used to represent literals, references to declarations, and would eventually be used for an `undef` value if we need one. IR functions, global variables, and basic blocks are all values (because they can appear as operands), but not instructions. The main benefit of this approach is that the top-level structure of a bytecode (BC) module is much simpler to understand and walk, and BC-level types are represented much more directly (such that we could conceivably use them for reflection soon). * fixup: 64-bit build fix * fixup: try to silence clang's pedantic dependent-type errors * fixup: bug in VM loading of constants 04 October 2017, 20:54:25 UTC
8a0ebb9 Get tests running/passing under Linux (#194) * Get tests running/passing under Linux - Fix up `dlopen` abstraction - Fix up some test cases to request hlsl (rather than default to dxbc) so they can run on non-Windows targets - Fix up test runner to ignore tests that can't run on the current platform (and not count those as failures) - Fix file handle leak in process spawner abstraction - Get additional test-related applications building - More tweaks to Travis script; in theory deployment is set up now (yeah, right) * fixup * fixup: Travis environment variable syntax * fixup: Buffer->begin * fixup: actually run full tests on one config * fixup: add build status badge for Travis 29 September 2017, 20:43:08 UTC
74f2f47 First attempt at a Linux build (#193) * First attempt at a Linux build - Fix up places where C++ idioms were written assuming lenient behavior of Microsoft's compiler - Add a few more alternatives for platform-specific behavior where Windows was the only platform accounted for. - Add a basic Makefile that can at least invoke our build, even if it isn't doing good dependency tracking, etc. - Build `libslang.so` and `slangc` that depends on it, using a relative `RPATH` to make the binary portable (I hope) - Add an initial `.travis.yml` to see if we can trigger their build process. * Fixup: const bug in `List::Sort` I'm not clear why this gets picked up by the gcc *and* clang that Travis uses, but not the (newer) gcc I'm using on Ubuntu here, but I'm hoping it is just some missing `const` qualifiers. * Fixup: reorder specialization of "class info" Clang complains about things being specialized after being instantiated (implicitly), and I hope it is just the fact that I generate the class info for the roots of the hierarchy after the other cases. We'll see. * Fixup: add `platform.cpp` to unified/lumped build * Fixup: Windows uses `FreeLibrary` and not `UnloadLibrary` * Fixup: fix Windows project file to include new source file This obviously points to the fact that we are going to need to be generating these files sooner or later. 27 September 2017, 18:17:39 UTC
b6cf0f4 Merge pull request #191 from tfoleyNV/ir-work More work on IR-based lowering and cross-compilation 25 September 2017, 18:27:54 UTC
0aa440a Fixup: deal with hitting `.obj` size limits for VS When using the lumped/"unity" build approach for Slang, the resulting `.obj` files run into number-of-sections limits in the VS linker. For now I'm using the `/bigobj` command-line flag to work around this for the `hello` example, just so I can be sure the lumped build still works, but longer term it seems like we need to just drop that approach anyway. The `render-test` application was switched to link against `slang.dll` since there is no reason to have multiple apps use the lumped approach. 25 September 2017, 16:35:34 UTC
4ea0c26 Fixup: typo in `core` library meta-code This seems to be a case where I edited the generated output file, which meant that its timestamp was newer than the input and it didn't trigger a rebuild, so I didn't notice the problem until the change went into CI (which did a fresh build). 25 September 2017, 15:42:37 UTC
1972fa3 More work on IR-based lowering and cross-compilation None of these changes are made "live" at the moment. I'm just trying to get them checked in to avoid diverging too far from `master` at any point during development. - Add basic emit logic to produce GLSL from the IR in a few cases (the existing IR emit logic was ad hoc and HLSL-specific) - When lowering a function declaration, walk up its chain of parent declarations to collect additional parameters as needed - When lowering a call, make sure to add generic arguments that come from the declaration reference being called - Attach a "mangled name" to symbols when lowering, so that we can eventually use that name to resolve things for linkage. - After the above work, I had to apply some fixups to make sure that generic arguments *don't* get added when the user is calling an `__intrinsic_op` function, since those should map 1-to-1 down to instructions with just their ordinary parameter list. A big open question right now is whether I should continue to represent the generic arguments as just part of the ordinary argument list for a function, or split them out into separate `applyGeneric` and `apply` steps. A strongly related question is whether a declaration with generic parameters should lower into a single declaration, or one declaration nested inside an outer generic declaration. A good future step at this point would be to eliminate a lot of the `__intrinsic_op` stuff in favor of having the builtin functions include their own definitions, which might be in terms of a new expression-level construct for writing inline IR operations. This can't be done until the existing AST-to-AST path is no longer needed for cross-compilation purposes. More immediate next steps here: - We need a way to round-trip calls to external declaration that get handled by this mangled-name logic. Basically, if we are asked to output HLSL and we see a call to `_S...GetDimensions...(float4, t, a, ...)` we need to be able to walk the mangled name and get back to `t.getDimensions(a, ...)` without a whole lot of manual definitions to make things round-trip. - In the other case, where a declaration isn't built-in for the chosen target, we need to be able to load a module of target-specific definitions (which will somehow map back to symbols with certain mangled names) and then look these up (by mangled name) and then load/link/inline them into the user's IR to satisfy requirements in their code. 22 September 2017, 21:18:44 UTC
b206af7 Merge pull request #190 from tfoleyNV/use-warp-for-d3d11-tests Use WARP for D3D rendering tests. 21 September 2017, 18:20:57 UTC
289a71e Use WARP for D3D rendering tests. This should in principle allow for the D3D-only tests to run on AppVeyor so that we can validate running things for CI purposes (and is probably a whole lot easier than trying to plug the VM up to the rendering tests). I've switched up one of the tests so that it should run even on AppVeyor, so fingers crossed that it will actually run. 21 September 2017, 17:53:56 UTC
0116717 Initial work on a "VM" for Slang code (#189) At a high level, this commit adds two things: 1. A "bytecode" format for serializing Slang IR instructions and related structure (functions, "registers") 2. A virtual machine that can load and then execute code in that bytecode format. The reason for kicking off this work right now is that we *need* a way to run tests on Slang code generation that doesn't rely on having a GPU present (given that our CI runs on VM instances without GPUs), nor on textual comparison to the output of other compilers. With these features I've implemented a slapdash `slang-eval-test` test fixture that can run a (trivial) compute shader to verify our compilation flow through to bytecode. Some key design constraints/challenges: - The bytecode format should be "position independent" so that a user can just load a blob of data and then inspect it without having to deserialize into another format, allocate memory, etc. Eventually the bytecode format might be a replacement for our current reflection API (we used to base reflection off a similar format, but the cost/benefit wasn't there at the time and we switched to just using the AST). - The VM should be able to execute bytecode functions without doing any per-operation translation, JIT, etc. (translation of more coarse-grained symbols is okay). For now the VM is just being used to run tests, but eventually I'd like it to be viable for: - Running Slang-based code in the context of the compiler itself. This starts with stuff like constant-folding in the front-end, but could expand to more general metaprogramming features. - Running Slang-based code within a runtime application (e.g., a game engine) that wants to be able to run things like "parameter shader" code, or even just evaluate compute-like code on CPU (e.g., when supporting particles on both CPU and GPU). - Finally, the bytecode format should ideally be able to round-trip back to the IR without unacceptable loss of information. This requirement and the previous one play off of each other, because things like a traditional SSA phi operation is ugly when you have to actually *execute* it. This doesn't matter right now when we don't have SSA yet, but it might be part of the decision-making here. The actual implementation is centralized in `bytecode.{h,cpp}` and `vm.{h.cpp}`. Big picture notes: - The space of opcodes is shared between IR and bytecode (BC), with the hope that this makes translation of operations between the two easy. - The actual bytecode instruction stream relies on a variable-length encoding for integer values, including opcodes and operand numbers, so that the common case is single-byte encoding. - In the long term I intend to have a rule that if you use a single-byte encoding for an opcode, then all operands are required to use single-byte encodings too. Operations that need multi-byte operands would then be forced to use a multi-byte encoding of the op, and would be sent down a slower path in the interpreter. - The "bytecode"'s outer structure is based on ordinary data structures linked with pointers, but they are "relative pointers" so the actual structure is position-independent. - There are two main kinds of operands: registers and "constants." An operand is a signed integer where non-negative values indicate registers (with `index == operandVal`) and negative values indicate constants (with `index == ~operandVal`). - Registers are stored in the "stack frame" for a VM function call, and each has a fixed offset based on the size of the type and those that come before it. Conceptually, registers are allowed to overlap if they aren't live at the same time, and we manage this with a simple stack model: every register is supposed to identify the register that comes directly before it (this isn't implemented yet). - "Constants" are more realistically a representation of "captured" values, but they are currently also how constants come in. Basically we can use a compact range of indices in the bytecode for a function, and each of these indices indirectly refers to some value in the next outer scope. - The actual encoding of bytecode instructions right now is largely ad-hoc and very wasteful (we encode the type on everything, and we also encode everything as if it had varargs). - In some cases, an instruction needs to know the types of the values involved (e.g., because it needs to load an array element, which means copying a number of bytes based on the size). The way the VM works, we have types attached to our registers, so we currently get sneaky and look at those types in some ops. Longer term it makes sense to encode the required type info directly in the BC. - There's a whole lot of hand-waving going on with how the actual top-level bytecode module gets loaded, because of the way we currently treat the top-level module as an instruction stream in the IR. This means that we try to represent the loaded module as a "stack frame" for a call to the module as a function, but that approach has serious problems, and isn't realistically what we want to do. 21 September 2017, 17:21:34 UTC
10b62ee IR: handle control flow constructs (#186) * IR: handle control flow constructs This change includes a bunch of fixes and additions to the IR path: - `slang-ir-assembly` is now a valid output target (so we can use it for testing) - This uses what used to be the IR "dumping" logic, revamped to support much prettier output. - A future change will need to add back support for less prettified output to use when actually debugging - IR generation for `for` loops and `if` statements is supported - HLSL output from the above control flow constructs is implemented - Revamped the handling of l-values, and in particular work on compound ops like `+=` - Add basic IR support for `groupshared` variables - Add basic IR support for storing compute thread-group size - Output semantics on entry point parameters - This uses the AST structures to find semantics, so it still needs work - Pass through loop unroll flags - This is required to match `fxc` output, at least until we implement unrolling ourselves. * Fixup: 64-bit build issues. * fixup for merge 14 September 2017, 22:37:05 UTC
8cdfce5 Merge pull request #184 from tfoleyNV/design-docs Add an initial design document on interfaces 13 September 2017, 19:36:53 UTC
5c45b09 Merge branch 'master' into design-docs 13 September 2017, 19:30:54 UTC
11bd663 Add an initial design document on interfaces This covers interfaces, generics, and associated types - hopefully with enough detail that we can start writing up example programs that we believe should compile. 13 September 2017, 19:27:31 UTC
3763ca9 Merge pull request #181 from tfoleyNV/ir-lowering-work Get IR working for `AdaptiveTessellationCS40/Render` test 12 September 2017, 18:25:16 UTC
2e0f8f2 Get IR working for `AdaptiveTessellationCS40/Render` test I had expected this to be the first case where control-flow instructions were needed, but it turns out that we aren't testing that entry point right now. The real work/fix here ended up being handling of the `row_major` layout qualifier on a matrix inside a `cbuffer`. The existing AST-based code was passing it through easily (although I don't believe it was handling the layout rules right). Getting it working in the IR involved beefing up the type-layout behavior so that it can handle explicit layout qualifiers (at least for matrix layout) which should also improve the layout computation for non-square matrices with nonstandard layout in the AST-based path. There are still some annoying things left to do: - The `row_major` and `column_major` layout qualifiers in HLSL/GLSL mean different things (well, they mean the reverse of one another) so I need to validate that the GLSL case is working remotely correctly. - The layout logic isn't handling other explicit-layout cases as supported by GLSL (but of course GLSL is a far lower priority than HLSL/Slang) - There is currently no way to pass in an explicit matrix layout flag to Slang to control the default behavior. - Any client who was using Slang for HLSL pass-through and then applying a non-default flag on their HLSL->* compilation will get nasty unexpected behavior when the IR goes live - and they are already dealing with nasty behavior around non-square matrices not matching layout between Slang and the downstream. - The logic that gleans layout modes from a variable declaration is currently only being applied for fields of structure types (which applies to `cbuffer` declarations as well), and not to global-scope uniform variables. - We need test cases for all of this. 12 September 2017, 16:56:41 UTC
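A small example of the kind of declaration this fix targets (hypothetical names):

```hlsl
cbuffer Transforms
{
    row_major float3x4 worldMatrix;   // explicit, non-default layout on a non-square matrix
    float4             tintColor;
};
```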
b0b0763 Merge pull request #180 from tfoleyNV/ir-lowering-work Get another test working with IR codegen 12 September 2017, 00:38:49 UTC
4644b64 Get another test working with IR codegen - Add support for matrix types in IR/codegen - Add support for basic indexing operations in IR/codegen 11 September 2017, 22:27:08 UTC
e2de1ea Merge pull request #179 from tfoleyNV/ir-lowering-work Support IR-based codegen for a few more examples. 11 September 2017, 21:46:20 UTC
2055d54 Support IR-based codegen for a few more examples. The main interesting change here is around support for lowering of calls to "subscript" operations (what a C++ programmer would think of as `operator[]`). An important infrastructure change here was to add an explicit AST-node representation for a "static member expression" which we use whenever a member is looked up in a type as opposed to a value. The implementation of this probably isn't robust yet, but it turns out to be important to be able to tell such cases apart. 11 September 2017, 21:06:02 UTC
80fb7b0 Merge pull request #178 from tfoleyNV/boilerplate-generator Initial work on boilerplate code generator 11 September 2017, 17:27:41 UTC
14137cb Initial work on boilerplate code generator The goal here is to get the Slang "standard library" code out of string literals and into something a bit more like an actual code file. This is handled by having a `slang-generate` tool that can translate a "template" file that mixes raw Slang code (or any language we want to generate...) with generation logic that is implemented in C++ (currently). This work isn't final by any stretch of the imagination, but it moves a lot of code and not merging it ASAP will complicate other changes. My expectation is that the generator tool will be beefed up on an as-needed basis, to get our stdlib code working. Similarly, the stdlib code does not really take advantage of the new approach as much as it could. That is something we can clean up along the way as we do modifications of the stdlib. 11 September 2017, 16:50:56 UTC
0e566a6 Merge pull request #177 from tfoleyNV/ir-work Replace old notion of "intrinsic" operations 07 September 2017, 17:31:37 UTC
ced92a0 Fixup: fix uninitialized memory bug This is a bug that already existed in the IR code, but wasn't getting triggered on existing test cases, and only seems to affect the 64-bit target right now. The "key" value being constructed to cache and re-use constants during IR generation wasn't actually being fully initialized, and so garbage values in it could cause the lookup of an existing value to fail where it should succeed. 07 September 2017, 16:50:50 UTC
ad35395 Replace old notion of "intrinsic" operations The code previously had an enumerated type for "intrinsic" operations, and allowed functions to be marked `__intrinsic_op(...)` to indicate the operation they map to. The nature of the IR meant that each of these intrinsic ops had to have a corresponding IR opcode, but the `enum` types weren't the same. This change cleans things up a bit by deciding that the `__intrinsic_op(...)` modifier names an actual IR opcode, and so the `IntrinsicOp` enum is gone. The biggest source of complexity here is that there are certain operations that need to be "intrinsic"-ish for the purposes of the current AST-based translation path, because we need them to round-trip from source to AST and back. Right now this is being handled by defining a bunch of "pseudo-ops" which can be used in the `__intrinsic_op` modifier, but which are *not* meant to be represented in the IR. Currently I don't actually handle this during IR generation. In the long run, once we are using IR for everything that needs cross-compilation, we should be able to eliminate the pseudo-ops in favor of just having these be ordinary (inline) functions defined in the stdlib (e.g., the `+=` operator can just have a direct definition). There was a second category of modifier that gets a little caught up in this, which is the `__intrinsic` modifier, which got used in two ways: 1. A function marked `__intrinsic(glsl, ...)` had what I call a "target intrinsic" modifier, which specified how to lower it for a specific target (e.g., GLSL). 2. A function just marked `__intrinsic` was supposed to be a marker for "this function shouldn't be emitted in the output, because the implementation is expected to be provided" The latter category of function should really be an `__intrinsic_op`, so I translated all those uses. I added a tiny bit of sugar so that `__intrinsic_op` without an explicit opcode will look up an opcode based on the name of the function being called, so that an operation like `sin` can automatically be plumbed through to an equivalent IR op. (The first category is a stopgap for the AST-based cross-compilation, and will hopefully be replaced by something better as we get the IR-based path working). Getting the switch from `__intrinsic` to `__intrinsic_op` working required shuffling around some code in `emit.cpp` that handles looking up those modifiers and emitting builtin operations appropriately during cross-compilation. Depending on where we go with things, a possible extension of this approach is to allow multiple operands to `__intrinsic_op` so that the first specifies the opcode, and then the rest are literal arguments to specify "sub-ops." This could help us handle stuff like texture-fetch operations without an explosion in the number of opcodes. I still need to think about whether this is a good idea or not. 07 September 2017, 16:16:41 UTC
ca16ede Merge pull request #176 from tfoleyNV/ir-work Continue work on IR-based codegen 06 September 2017, 21:15:11 UTC
5900f32 Continue work on IR-based codegen This gets us far enough that we can convert a single test case to use the IR, under the new `-use-ir` flag. Getting this merged into mainline will at least ensure that we keep the IR path working in a minimal fashion, even when we have to add functionality to the existing AST-based path. There is definitely some clutter here from keeping both IR-based and AST-based translation around, but I don't want to have a long-lived branch for the IR that gets further and further away from the `master` branch that is actually getting used and tested. Summary of changes: - Add pointer types and basic `load` operation to be able to handle variable declarations - Add basic `call` instruction type - Add simple address math for field reference in l-value - Always add IR for referenced decls to global scope - Add notion of "intrinsic" type modifier, which maps a type declaration directly to an IR opcode (plus optional literal operands to handle things like texture/sampler flavor) - Improve printing of IR instructions, types, operands - Add constant-buffer type to IR - Allow any instruction to be detected as "should be folded into use sites" and use this to tag things of constant-buffer type - Also add logic for implicit base on member expressions, to handle references to `cbuffer` members - Add connection back to original decl to IR variables (including global shader parameters...) - Use reflection name instead of true name when emitting HLSL from IR (so that we can match HLSL output) - Make IR include decorations for type layout - Re-use existing emit logic for HLSL semantics to output `register` semantics for IR-based code - Make IR-based codegen be an option we can enable from the command line - It still isn't on by default (it can barely manage a trivial shader), but it seems better to enable it always instead of putting it under an `#ifdef` - Fix up how we check for intrinsic operations during AST-based cross compilation so that adding new intrinsic ops for the IR won't break codegen. 06 September 2017, 20:52:04 UTC
e59a1b3 Merge pull request #175 from tfoleyNV/implicit-conversion Move implicit conversion operations to stdlib 05 September 2017, 20:16:21 UTC
8b91fb8 Add a basic test case for HLSL implicit conversions This at least tries to capture some of the basic rules about what implicit conversions are allowed. 05 September 2017, 19:51:33 UTC
ff7c190 Move implicit conversion operations to stdlib - Previously, there were a variety of rules in `check.cpp` to pick the conversion cost for various cases involving scalar, vector, and matrix types. - The main problem with the previous approach is that any lowering pass would need to convert an arbitrary "type cast" node into the right low-level operation(s). - The new approach is that a type conversion (implicit or explicit) always resolves as a call to a constructor/initializer for the destination type. This means that the existing rules around marking operations as builtins should work for lowering. - To support this, the checking logic needs to perform lookup of initializers/constructors when asked to perform conversion between types. It does this by re-using the existing logic for lookup and overload resolution if/when a type was applied in an ordinary context. - Next, we define a modifier that can be attached to constructors/initializers to mark them as suitable for implicit conversion, and associate them with the correct cost to be used when doing overload comparisons. - We add the modifier to all the scalar-to-scalar cases in the stdlib, using the logic that previously existed in semantic checking. - Next we add cases for general vector-to-scalar conversions that also convert type, using the same cost computation as above. - This probably misses various cases, but at this point they can hopefully be added just in the stdlib. - One gotcha here is that in lowering, we need to make sure to lower any kind of call expression to another call expression of the same AST node class, so that we don't lose information on what casts were implicit/hidden in the source-to-source case. Two notes for potential longer-term changes: 1. There is still some duplication between the type conversion declarations here and the "join" logic for types used for generic arguments. Ideally we'd eventually clean up the "join" logic to be based on convertibility, but that isn't a high priority right now, as long as joins continue to pick the right type. 2. It is a bit gross to have to declare all the N^2 conversions for vector/matrix types to duplicate the cases for scalars. For the simple scalar-to-vector case, we might try to support multiple conversion "steps" where both a scalar-to-scalar and a scalar-to-vector step can be allowed (this could be tagged on the modifiers already introduced). That simple option doesn't scale to vector-to-vector element type conversions, though, where you'd really want to make it a generic with a constraint like: vector<T,N> init<U>(vector<U,N> value) where T : ConvertibleFrom<U>; Here the `ConvertibleFrom<U>` interface expresses the fact that a conforming type has an initializer that takes a `U`. What doesn't appear in this context is any notion of conversion costs. We'd need some kind of system for computing the conversion cost of the vector conversion from the cost of the `T` to `U` conversion. 05 September 2017, 19:15:28 UTC
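From the user's side, the conversions now being driven by stdlib initializer declarations include cases like the following (hypothetical snippet):

```hlsl
float4 widen(int3 iv)
{
    float3 fv = iv;   // vector-to-vector conversion that changes element type (int3 -> float3)
    float  s  = 2;    // scalar-to-scalar conversion (int -> float)
    return float4(fv * s, 1.0);
}
```

Each of these now resolves to an initializer declaration in the stdlib carrying an explicit implicit-conversion cost, rather than to hard-coded rules in `check.cpp`.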
6207340 Merge pull request #174 from tfoleyNV/modifier-lowering-fix Fix some issues around cloned modifiers. 31 August 2017, 18:08:54 UTC
60ed105 Fix some issues around cloned modifiers. The root of the problem here is that: - We do a shallow copy of modifiers when "lowering" declarations/statements, by just copying over the head pointer of the linked list of modifiers - During lowering we sometimes add additional modifiers (only used during lowering), and these can thus accidentally get added to the end of the list of modifiers for the original declaration (rather than just the lowered decl) - If the same declaration is used by multiple entry points to be output, then a modifier added by the first entry point (which could reference entry-point-specific storage) will be earlier in the modifier list and might be picked up by a later entry point, so that we dereference already released memory The simple fix for right now is to use the support for a "shared" modifier node to ensure that each lowered declaration/statement gets a unique modifier list. A better long-term fix is: 1. Don't use modifiers to store general side-band information, and instead use proper lookup tables that own their contents. 2. Don't use a linked list to store modifiers (this was done to make splicing easy, but now we have a whole class of bugs related to bad splices), and be willing to clone them as needed. 31 August 2017, 17:41:03 UTC
227f9f5 Merge pull request #173 from tfoleyNV/resources-in-structs-fixes Fix some resources-in-structs bugs 26 August 2017, 03:13:19 UTC
c077a08 Fixup for test failure AppVeyor has a different version of fxc installed by default, and it produces subtly different output for this test case. It seems like later versions are clever enough to completely eliminate an empty `cbuffer` declaration, but earlier versions aren't. I'm actually not entirely sure why Slang is successfully eliminating the cbuffer as well, but the output DXBC implies it was not generated. 26 August 2017, 02:21:55 UTC
7c35d9e Try to print failure from AppVeyor tests 26 August 2017, 02:15:18 UTC
e62f905 Fixup: handle splice of multiple modifiers I changed the logic so that it might splice a new modifier into the existing linked list (not just at the end), but failed to account for the case where what we are splicing in isn't just a single modifier, but a whole *list* (which occurs when splicing in the shared modifiers themselves). This change handles that case with a bit of annoying linked-list cleverness. 26 August 2017, 01:55:49 UTC
8c68434 Fix some resources-in-structs bugs Fixes #171 Fixes #172 These two bugs were caused by bad logic in the handling of splitting resource-containing `cbuffer` declarations. - Issue #171 was the case where a `cbuffer` *only* had resource fields, in which case we crashed whenever referencing any field (some code was assuming there had to be non-resource fields) - Issue #172 was a case where two fields were declared with a single declaration (`Texture2D a, b;`), and the logic we had for tracking resource-type fields was accidentally tagging *both* fields with a single modifier so that field `b` would get confused for `a` in some contexts, and attempts to access `b` would crash. Both issues are now fixed, and regression tests have been added. 25 August 2017, 22:48:49 UTC
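Hypothetical reductions of the two cases (not the actual regression tests):

```hlsl
cbuffer OnlyResources            // issue #171: only resource-type fields;
{                                // referencing any field used to crash
    Texture2D    diffuseMap;
    SamplerState linearSampler;
}

cbuffer MixedDeclaration         // issue #172: two fields in one declaration;
{                                // accesses to `b` got confused with `a` and crashed
    Texture2D a, b;
    float4    scale;
}
```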
7e05e06 Merge pull request #170 from tfoleyNV/ir Add some dummy logic to print IR to HLSL 17 August 2017, 22:22:24 UTC
ec8175c [ir] Add support for "decorations" on instructions The terminology here is similar to SPIR-V. For right now the only decoration exposed is a fairly brute-force one that just points back to a high-level declaration so that we can look up info on it that might affect how we print output. 17 August 2017, 21:51:09 UTC
95348fd [ir] Represent fields more directly Previously, a `StructType` was an ordinary instruction that took a variable number of types as operands, representing the types of fields. This ends up being inconvenient for a few reasons: - To add decorations to the fields, you'd end up having to decorate the struct type instead (SPIR-V has this problem) - You need to compute field indices during lowering, when you might prefer to defer that until later - The get/set field operations now need an index, which needs to be an explicit operand, which means a magic numeric literal floating around to represent the index The new approach fixes the first two of these, and at least makes the last one a bit nicer. A `StructDecl` is now a parent instruction, and its sub-instructions represent the fields of the type - each field is an explicit instruction of type `StructField`. The operation to extract a field takes a direct reference to the struct field, so everything is quite explicit. 17 August 2017, 21:17:29 UTC
ad19484 Add some dummy logic to print IR to HLSL - Change IR instructions to just hold an integer opcode instead of a pointer to the "info" structure - Externalize definition of IR instructions to a header file, and use the "X macro" approach to allow generating different definitions - Add notion of function types to the IR, so that we can easily query the result type of a function - Add some convenience accessors to allow walking the IR in a strongly-typed manner (e.g., iterate over the parameters of a function) - TODO: these should really be changed to assert the type of things, at least in debug builds - Add very basic logic to `emit.cpp` so that it can walk the generated IR and start printing it back as HLSL - This isn't meant to be usable as-is, but it is a step toward where we need to go 17 August 2017, 20:44:02 UTC
1965c3f Merge pull request #169 from tfoleyNV/ir IR generation cleanup work 17 August 2017, 19:18:37 UTC
5de3003 IR generation cleanup work - Make all instructions store their argument count for now, so we can iterate over them easily. - Longer term we might try to optimize for space because the common case is that the operand count is known, but keeping it simpler seems better for now - Split apart the creation of an instruction from adding it to a parent - Use the above capability to make sure that we add a function to its parent *after* all the parameter/result type emission has occurred. - Perform simple value numbering for types during IR creation - This logic also tries to pick a good parent for any type instructions, so that types don't get created local to a function unless they really need to - Create all constants at global scope, and re-use when values are identical 17 August 2017, 18:22:22 UTC
5230ad2 Merge pull request #168 from tfoleyNV/resources-in-structs-control Add a flag to control type splitting 17 August 2017, 17:58:11 UTC
d13bd05 Add a flag to control type splitting The `-split-mixed-types` flag can be provided to command-line `slangc`, and the `SLANG_COMPILE_FLAG_SPLIT_MIXED_TYPE` flag can be passed to `spSetCompileFlags`. Either of these turns on a mode where Slang will split types that include both resource and non-resource fields. The declaration of such a type will just drop the resource fields, while a variable declared using such a type turns into multiple declarations: one for the non-resource fields, and then one for each resource field (recursively). This behavior was already implemented for GLSL support, and this change just adds a flag so that the user can turn it on unconditionally. Caveats: - This does not apply in "full rewriter" mode, which is what happens if the user doesn't use any `import`s. I could try to fix that, but it seems like in that mode people are asking to bypass as much of the compiler as possible. - When it *does* apply, it applies to user code as well as library/Slang code. So this will potentially rewrite the user's own HLSL in ways they wouldn't expect. I don't see a great way around it, though. 17 August 2017, 15:59:30 UTC
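A hedged usage sketch of the new flag through the C API is below. It assumes the legacy Slang entry points `spCreateSession`, `spCreateCompileRequest`, `spDestroyCompileRequest`, and `spDestroySession` from this era of the API; check `slang.h` in your build for the exact signatures. The command-line equivalent is `slangc -split-mixed-types ...`.

```cpp
#include <slang.h>

int main()
{
    SlangSession*        session = spCreateSession(NULL);
    SlangCompileRequest* request = spCreateCompileRequest(session);

    // Ask Slang to split types that mix resource and non-resource fields,
    // even when no target strictly requires it.
    spSetCompileFlags(request, SLANG_COMPILE_FLAG_SPLIT_MIXED_TYPE);

    // ... add translation units, entry points, and targets, then compile ...

    spDestroyCompileRequest(request);
    spDestroySession(session);
    return 0;
}
```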
3dd88c2 Merge pull request #167 from tfoleyNV/ir More work on IR 16 August 2017, 23:25:32 UTC
e30ba2f Fixups for IR checkpoint - The changes introduced a new path where we don't even go through the current "lowering" (really an AST-to-AST legalization pass), but this exposed a few issues I didn't anticipate: - First, we needed to make sure to pass in the computed layout information when emitting the original program (since the layout info is no longer automatically attached to AST nodes) - Second, we needed to take the sample-rate input checks that were being done in lowering before, and move them to the emit logic (which is really ugly, but I don't see a way around it for GLSL). 16 August 2017, 23:04:09 UTC
87c5067 More work on IR With this change, basic generation of IR works for a trivial shader, and there is some basic support for dumping the generated IR in an assembly-like format. As with the other IR change, the use of the IR is statically disabled for now, so that existing users won't be affected. 16 August 2017, 19:44:50 UTC
74e04b8 Merge pull request #166 from tfoleyNV/add-builtins-to-core Add user-defined builtins to the "core" module 15 August 2017, 19:51:38 UTC
6de0a48 Merge pull request #165 from tfoleyNV/ir Starting to add intermediate representation (IR) 15 August 2017, 19:51:14 UTC
d094bfc Add user-defined builtins to the "core" module The Slang API allows an expert user to feed in source code that the compiler then treats as if it came from the Slang "standard library." They can use this to introduce new builtin types, functions, etc. - so long as they are careful, and are willing to deal with the lack of any compatibility guarantees across versions. At some point I split the Slang standard library into distinct modules, so that GLSL and HLSL builtins wouldn't pollute each other's namespace. In that change, I had to decide what module any new user-defined builtins should get added to, and I apparently decided they should go into the module for the Slang language, which would only affect `.slang` files. This doesn't work at all if the user wants to declare new HLSL builtins. I've gone ahead and made user extensions add to the "core" module (which is used by all of HLSL, GLSL, and Slang), but a better long-term fix would be to let the user pick the module/language the new builtins should apply to. 15 August 2017, 19:46:51 UTC
831896f Starting to add intermediate representation (IR) Right now none of this is hooked up, but I want to get things checked in incrementally rather than have long-lived branches. - Added placeholder declarations for IR representation of instructions, basic blocks, etc. - Start adding a `lower-to-ir` pass to translate from AST representation to IR Again: none of this is functional, so it shouldn't mess with existing users of the compiler. 15 August 2017, 19:21:08 UTC
e6abc68 Merge pull request #162 from tfoleyNV/gh-38 Improve diagnostics for overlapping/conflicting bindings 15 August 2017, 18:09:03 UTC
f64bbf7 Add missing expected output file for test. 15 August 2017, 17:07:49 UTC
4a94f39 Improve diagnostics for overlapping/conflicting bindings Closes #38 - Change the overlapping-bindings case from an error to a warning (it is *technically* allowed in HLSL/GLSL) - Make diagnostic messages for these cases include a note that points at the "other" declaration in each case, so that the user can more easily isolate the problem - Unrelated fix: make sure `slangc` sets up its diagnostic callback *before* parsing command-line options so that error messages output during options parsing will be visible - Unrelated fix: make sure that formatting for diagnostic messages doesn't print the diagnostic ID for notes (they all have IDs < 0). - Note: eventually I'd like to not print diagnostic IDs at all (I think they are cluttering up our output), but doing that requires touching all the test cases... 15 August 2017, 16:33:36 UTC
4a9b281 Merge pull request #161 from tfoleyNV/gh-160 Handle possibility of bad types in varying input/output signature. 15 August 2017, 16:28:50 UTC
3a2f191 Handle possibility of bad types in varying input/output signature. Fixes #160 If the front-end runs into a type it doesn't understand in the parameter list of an entry point, it will create an `ErrorType` for that parameter, but then the parameter binding/layout rules will fail to create a `TypeLayout` for the parameter (and return `NULL`). There were some places where the code was expecting that operation to succeed unconditionally, and so would crash when there was a bad type. The specific case in the bug report was when the return type of a shader entry point was bad: // `vec4` is not an HLSL type vec4 main(...) { ... } Note that the specific case in the bug report only manifests in "rewriter" mode (when the Slang compiler isn't allowed to issue error messages from the front-end), but the same basic thing would happen if the varying parameter/output had used a type that is invalid for varying input/output: Texture2D main(...) { ... } I'm not 100% happy with just adding more `NULL` checks for this, because there is no easy way to tell if they are exhaustive. A better solution in the longer term might be to construct a kind of `ErrorTypeLayout` to represent cases where we wanted a type layout, but none could be constructed. 15 August 2017, 15:52:13 UTC
aeb247c Merge pull request #159 from tfoleyNV/name-type Name type 15 August 2017, 01:50:46 UTC
9885c97 Add an explicit `Name` type Fixes #23 Up to this point, the compiler has used the ordinary `String` type to represent declaration names, which means a bunch of lookup structures throughout the compiler were string-to-whatever maps, which can reduce efficiency. It also means that things like the `Token` type end up carrying a `String` by value and paying for things like reference-counting. This change adds a `Name` type that is used to represent names of variables, types, macros, etc. Names are cached and unique'd globally for a session, and the string-to-name mapping gets done during lexing. From that point on, most maps are keyed on pointers, which should make all the various table lookups faster. More importantly (possibly), this brings us one step closer to being able to pool-allocate the AST nodes. 14 August 2017, 21:48:37 UTC
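A generic sketch of the name-interning idea described above (hypothetical types, not the actual Slang classes): the session owns a string-to-`Name` map, so each distinct identifier is allocated once and later comparisons and table lookups are pointer operations.

```cpp
#include <string>
#include <unordered_map>

// Interned identifier: two Name pointers compare equal iff the text is equal.
struct Name
{
    std::string text;
};

struct NamePool
{
    std::unordered_map<std::string, Name*> names;

    Name* getName(const std::string& text)
    {
        auto it = names.find(text);
        if (it != names.end())
            return it->second;          // already interned

        Name* name = new Name{ text };  // intern once, for the pool's lifetime
        names[text] = name;
        return name;
    }
};

// Once lexing maps identifier text to Name*, symbol tables can be keyed on
// the pointer itself (e.g. std::unordered_map<Name*, Decl*>), avoiding
// string hashing and reference-counted string copies in hot paths.
```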
7f57ea4 Rename `Name` fields to `name` This is in preparation for using `Name` as a type name. 14 August 2017, 15:02:03 UTC
bb66d6e Merge pull request #158 from tfoleyNV/syntax-lookup Data-driven parsing of modifiers 12 August 2017, 23:14:48 UTC
f1bda6a Data-driven parsing of modifiers Just like the previous change did for declaration keywords, this change uses the lexical environment to drive the lookup and dispatch of modifier parsing. This allows us to easily add modifiers to Slang, even when they might conflict with identifiers used in user code (because the modifier names are no longer special keywords, but ordinary identifiers). There was already some support for ideas like this with `__modifier` declarations (`ModifierDecl`) used to introduce some GLSL-specific keywords (so that they wouldn't pollute the namespace of HLSL files). The new approach changes these to be actual `syntax` declarations (`SyntaxDecl`) with the same representation as those used to introduce declaration keywords. Because many modifiers just introduce a single keyword that maps to a simple AST node (no further tokens/data), I modified the handling of syntax declarations so that they can take a user-data parameter, which allows the common case ("just create an AST node of this type...") to be handled with minimal complications. This also adds a general-purpose string-based lookup path for AST node classes, which should support programmatic creation in more cases. Statements are now the main remaining case of keywords that need to be made table-driven. 12 August 2017, 21:50:01 UTC
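A generic sketch of the "callback plus user data" shape described above (all names hypothetical, not the actual Slang representation): simple modifiers share one parse callback whose payload is just a factory for the AST node class to construct.

```cpp
#include <string>
#include <unordered_map>

struct Modifier { virtual ~Modifier() = default; };
struct InModifier  : Modifier {};
struct OutModifier : Modifier {};

struct Parser;   // real parser state elided

// A registered piece of syntax: a parse callback plus a payload. In the
// common case the payload is simply a factory for the node class.
using NodeFactory = Modifier* (*)();

struct SyntaxEntry
{
    Modifier* (*parseCallback)(Parser&, NodeFactory payload);
    NodeFactory payload;
};

// Shared callback for "just create an AST node of this type".
Modifier* parseSimpleModifier(Parser&, NodeFactory payload)
{
    return payload();
}

template<typename T>
Modifier* makeNode() { return new T(); }

// Keyword names map to syntax entries; because the names are ordinary
// identifiers, user declarations with the same spelling can shadow them.
std::unordered_map<std::string, SyntaxEntry> modifierSyntax =
{
    { "in",  { &parseSimpleModifier, &makeNode<InModifier>  } },
    { "out", { &parseSimpleModifier, &makeNode<OutModifier> } },
};
```

Modifiers that carry extra tokens or data would register a bespoke callback instead; only the trivial "single keyword, single node" case rides on the shared one.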
495f881 Merge pull request #157 from tfoleyNV/syntax-lookup Look up declaration keywords using ordinary scoping. 11 August 2017, 21:04:24 UTC
b2febc7 Look up declaration keywords using ordinary scoping. The existing parser code was doing string-based matching on the lookahead token to figure out how to parse a declaration, e.g.: ``` if(lookAhead == "struct") { /* do struct thing */ } else if(lookAhead == "interface") { /* do interface thing */ } ... ``` That approach has some annoying down-sides: - It is slower than it needs to be - It is annoying to deal with cases where the available declaration keywords might differ by language - Most importantly, it is not possible for us to introduce "extended" keywords that the user can make use of, but which can otherwise be ignored and treated as ordinary identifiers. That last part is important. Suppose the user wanted to have a local variable named `import`, but we also had a Slang extension that added an `import` keyword. Then a line of code like `import += 1` would lead to a failure because we'd try to parse an import declaration, even when it is obvious that the user meant their local variable. This would mean that Slang can't parse existing user code that might clash with syntax extensions. This issue is the reason why we currently have keywords like `__import`. A traditional solution in a compiler is to map keywords to distinct token codes as part of lexing, which eliminates the first concern (performance) because now we can dispatch with `switch`. It can also alleviate the second concern if we add/remove names from the string->code mapping based on language (the rest of the parsing logic doesn't have to know about keywords being added/removed). The solution we go for here is more aggressive. Instead of mapping keyword names to special token codes during lexing, we instead introduce logical "syntax declarations" into the AST, which are looked up using the ordinary scoping rules of the language. Depending on what code is imported into the scope where parsing is going on, different keywords may then be visible. This solves our last concern, since a user-defined variable that just happens to use the same name as a keyword is now allowed to shadow the imported declaration for syntax (this is akin to, e.g., Scheme, where there really aren't any "keywords"). This also opens the door to the possibility of eventually allowing users to define their own syntax (again, like Scheme). For now I'm only using this for the declaration keywords. With this change it should be pretty easy to also add statement keywords in the same fashion. 11 August 2017, 19:34:58 UTC
db4079f Merge pull request #156 from tfoleyNV/source-file-cleanup Make source location lightweight 10 August 2017, 22:25:04 UTC
a5a436c Make source location lightweight Fixes #24 So far the code has used a representation for source locations that is heavy-weight, but typical of research or hobby compilers: a `struct` type containing a line number and a (heap-allocated) string. This is actually very convenient for debugging, but it means that any data structure that might contain a source location needs careful memory management (because of those strings) and has a tendency to bloat. The new representation is that a source location is just a pointer-sized integer. In the simplest mental model, you can think of this as just counting every byte of source text that is passed in, and using those counts to name locations. Finding the path and line number that corresponds to a location involves a lookup step, but we can arrange to store all the files in an array sorted by their start locations, and do a binary search. Finding line numbers inside a file is similarly fast (once you pay a one-time cost to build an array of starting offsets for lines). More advanced compilers like clang actually go further and create a unique range of source locations to represent a file each time it gets included, so that they can track the include stack and reproduce it in diagnostic messages. I'm not doing anything that clever here. 10 August 2017, 20:05:04 UTC
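A minimal sketch of the integer source-location scheme described above (generic, with hypothetical names): each file owns a contiguous range of location values, so mapping a location back to its file is a binary search over sorted file start locations, and mapping to a line is a binary search over that file's line-start offsets.

```cpp
#include <algorithm>
#include <string>
#include <vector>

using SourceLoc = size_t;   // an offset into one global "address space" of source text

struct SourceFile
{
    std::string path;
    SourceLoc   startLoc;                   // first location owned by this file
    std::vector<size_t> lineStartOffsets;   // sorted offsets of each line start
};

struct SourceManager
{
    std::vector<SourceFile> files;          // kept sorted by startLoc
    SourceLoc nextLoc = 0;                  // where the next registered file will start

    // Assumes `loc` was handed out by this manager (so some file contains it).
    const SourceFile& findFile(SourceLoc loc) const
    {
        auto it = std::upper_bound(files.begin(), files.end(), loc,
            [](SourceLoc l, const SourceFile& f) { return l < f.startLoc; });
        return *(it - 1);                   // last file starting at or before loc
    }

    size_t findLine(SourceLoc loc) const
    {
        const SourceFile& file = findFile(loc);
        size_t offset = loc - file.startLoc;
        auto it = std::upper_bound(
            file.lineStartOffsets.begin(), file.lineStartOffsets.end(), offset);
        return size_t(it - file.lineStartOffsets.begin());   // 1-based line number
    }
};
```

The location value itself stays a plain integer everywhere it is stored, so AST nodes and tokens carry no strings and need no special memory management.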
6e4830f Merge pull request #155 from tfoleyNV/renaming Major naming overhaul: 09 August 2017, 18:20:09 UTC
8e4c0c3 Merge pull request #154 from tfoleyNV/fix-lowering Fix use of "pseudo-syntax" in current lowering pass 09 August 2017, 17:23:29 UTC
695c270 Major naming overhaul: - `ExpressionSyntaxNode` becomes `Expr` - `StatementSyntaxNode` becomes `Stmt` - `StructSyntaxNode` becomes `StructDecl` - `ProgramSyntaxNode` becomes `ModuleDecl` - `ExpressionType` becomes `Type` - Existing fields named `Type` become `type` - There might be some collateral damage here if there were, e.g., `enum`s named `Type`, but I can live with that for now and fix those up as I see them 09 August 2017, 17:23:09 UTC
a728612 Fix use of "pseudo-syntax" in current lowering pass The so-called "lowering" pass (really a kind of AST-to-AST legalization pass right now) needs to handle some basic scalarization of structured types, and it does this by inventing what I call "pseudo-expressions" and "pseudo-declarations." For example, there is a pseudo-expression node type that represents a tuple of N other expressions, and certain operations act element-wise over such tuples. The problem was that the implementation introduced these out-of-band expression/declaration types into the existing AST hierarchy, which led to a dilemma: - If these new AST nodes were declared like all the others (and integrated into the visitor dispatch approach, etc.) then every pass would need to deal with them even though they are meant to be a transient implementation detail of this one pass - But if the new nodes *aren't* declared like the others, then they can't meaningfully interact with visitor dispatch, and will just crash the compiler if they somehow "leak" through to later passes. And because they are just ordinary AST nodes from a C++ type-system perspective, such leaking is entirely possible (if not probable) Hopefully that setup helps make the solution clear: instead of having the "lowering" pass map an expression to an expression, it needs to map an expression to a new data type (here called `LoweredExpr`) that can wrap *either* an ordinary expression (the common case) or one of the new out-of-band values. Any code that accepts a `LoweredExpr` needs to handle all the cases, or explicitly decide that it can't/won't deal with anything other than ordinary expressions. Most of the code changes are straightforward at that point, although the whole "lowering" approach is a bit fiddly right now, so getting the tests passing took a bit of attention. I'm not sure our test coverage of all this code is great, so I wouldn't be surprised if some failures are lurking still. 09 August 2017, 16:59:50 UTC
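A generic sketch of the wrapper-type idea (hypothetical names, not the actual pass): the result of lowering is a small tagged value that holds either an ordinary expression or an out-of-band tuple, so code that only handles the common case has to say so explicitly instead of letting pseudo-nodes leak into later passes.

```cpp
#include <cassert>
#include <vector>

struct Expr { /* ordinary AST expression */ };

// Out-of-band value produced only inside the lowering pass: a tuple of
// already-lowered pieces (e.g. a struct scalarized field by field).
struct LoweredTuple;

struct LoweredExpr
{
    enum class Flavor { None, Simple, Tuple } flavor = Flavor::None;
    Expr*         simple = nullptr;
    LoweredTuple* tuple  = nullptr;

    static LoweredExpr makeSimple(Expr* e)
    {
        LoweredExpr r; r.flavor = Flavor::Simple; r.simple = e; return r;
    }
    static LoweredExpr makeTuple(LoweredTuple* t)
    {
        LoweredExpr r; r.flavor = Flavor::Tuple; r.tuple = t; return r;
    }

    // Callers that only support ordinary expressions must opt in explicitly,
    // rather than silently passing a pseudo-expression to a later pass.
    Expr* getSimple() const { assert(flavor == Flavor::Simple); return simple; }
};

struct LoweredTuple
{
    std::vector<LoweredExpr> elements;
};
```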
9ad2b40 Merge pull request #153 from tfoleyNV/remove-globals Remove uses of global variables 07 August 2017, 22:44:00 UTC
7b54f43 Remove uses of global variables There were two main places where global variables were used in the Slang implementation: 1. The "standard library" code was generated as a string at run-time, and stored in a global variable so that it could be amortized across compiles. 2. The representation of types uses some globals (well, class `static` members) to store common types (e.g., `void`) and to deal with memory lifetime for things like canonicalized types. In each case the "simple" fix is to move the relevant state into the `Session` type that controlled their lifetime already (the `Session` destructor was already cleaning up these globals to avoid leaks). For the standard library stuff this really was easy, but for the types it required threading through the `Session` a bit carefully. One more case that I found: there was a function-`static` variable used to generate a unique ID for files output when dumping of intermediates is enabled (this is almost strictly a debugging option). Rather than make this counter per-session (which would lead to different sessions on different threads clobbering the same few files), I went ahead and used an atomic in this case. Note that the remaining case I had been worried about was any function-`static` counter that might be used in generating unique names. It turns out that right now the parser doesn't use such a counter (even in cases where it probably should), and the lowering pass already uses a counter local to the pass (again, whether or not this is a good idea). This change should be a major step toward allowing an application to use Slang in multiple threads, so long as each thread uses a distinct `SlangSession`. The case of using a single session across multiple threads is harder to support, and will require more careful implementation work. 07 August 2017, 22:16:54 UTC
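For the one remaining function-`static` counter mentioned above (unique IDs for dumped intermediate files), the fix amounts to something like the following generic sketch (the function name is hypothetical): keep the counter global, but make the increment atomic so concurrent sessions neither race nor clobber each other's file names.

```cpp
#include <atomic>
#include <string>

// A plain `static int counter = 0; return counter++;` would be a data race
// when two sessions compile on different threads. An atomic keeps the IDs
// unique process-wide without any per-session plumbing.
std::string getUniqueIntermediateFileName()
{
    static std::atomic<int> counter{ 0 };
    int id = counter.fetch_add(1);
    return "intermediate-" + std::to_string(id) + ".txt";
}
```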
ca8eea9 Update README.md Typo fix. 27 July 2017, 16:49:49 UTC
0045a63 Merge pull request #149 from tfoleyNV/docs Update documentation. 26 July 2017, 22:52:40 UTC
92463c8 Update documentation. - Update readme to fill out some of the `TODO` sections - Add an API user's guide that gives the basics of linking against Slang and using it to compile and reflect shaders - Add a bit of usage info for the command-line `slangc` program - Add an overview of the Slang language as it stands today - Add an initial FAQ, mostly to help answer the "why should I use this?" question 26 July 2017, 21:56:45 UTC
174bc66 Merge pull request #148 from shader-slang/add-code-of-conduct Add Code of Conduct 26 July 2017, 21:42:19 UTC
213d609 Create CODE_OF_CONDUCT.md 26 July 2017, 21:41:11 UTC
155693a Merge pull request #146 from tfoleyNV/output-path-option Add a `-o` option to command-line `slangc` 26 July 2017, 01:22:12 UTC