https://github.com/shader-slang/slang
Revision 9a2a35f9282570ed075ebc2313f4aa9ed7a643ce authored by Tim Foley on 28 August 2020, 18:21:40 UTC, committed by GitHub on 28 August 2020, 18:21:40 UTC
Most people agree that it is a Good Thing when compilers are deterministic: the exact same input bits produce the exact same output bits every time the compiler is run. Bonus points are awarded if the results are independent of the platform the compiler was compiled for and run on.

One of the easiest kinds of nondeterminism to have sneak into a compiler is for it to produce the "same" code inside functions, but sometimes emits functions or other global symbols in a different order from run to run. Right now, the Slang compiler has some of this kind of nondeterminism.

The main way (but not necessarily the only way) that a compiler ends up producing output with a different ordering across runs is by iterating over the contents of a hash-based container (in our codebase, a `Dictionary` or `HashSet`), where the keys make use of pointers. Most operating systems intentionally try to randomize the address space of processes across runs (as a security feature), so that exact pointer values are not stable across runs, and thus hash value are not stable across runs, and thus the ordering of entries is not stable across runs.

This change identifies a few cases of iterating over dictionaries or sets that could have produced output non-determinism:

* The `HLSLIntrinsicSet` was using a `Dictionary` to store intrinsics that had been referenced, and would later produce a linear list of those intrinsics based on their order in the dictionary.

* The `WitnessTable`s produced by the front-end stored a `Dictionary` or requirements, and lowering from AST->IR was iterating over that dictionary to ensure that everythign got emitted.

* The `SharedSemanticsContext` was tracking a `HashSet` of modules that were imported into scope (so that their `extension`s should be visible), and an iteration over that list was used when producing candidate extensions during lookup. This case is unlikely to cause any nondeterminism in final output, but could lead to nondeterministic ordering in diagnostic messages for ambiguous reference/overload cases.

* The IR linker maintains a `Dictionary` of symbols based on their mangled names, and iterates over it in code that clones all witness tables into the linked IR whether or not they are referenced.

For most of these cases the fix is simple:

* Keep both a `Dictionary`/`HashSet` and a `List` of the appropriate type
* Whenever adding to the hash-based container also add to the list
* Whenever iterating, iterate over the list

In the final case of the IR linker, the relevant code was marked with a `TODO` comment noting that it shouldn't actually be needed, so I simply dropped it and the change doesn't seem to break any of our tests. I've been fairly confident that code wasn't needed for a while.

This change isn't exactly elegant, and a better long term solution might be to introduce two new types, `OrderedDictionary` and `OrderedSet`, which are similar to `Dictionary` and `HashSet` except that they guarantee a deterministic order of enumeration of their contents, based on insertion order.

(Note that a `SortedDictionary` and/or `SortedSet` that use something like a binary tree to produce a "determinsitc" sorted order wouldn't actually help here, because sorting entries by pointer values wouldn't solve the underlying problem that the pointer values aren't stable across runs)

I've chosen to avoid adding new types to `core` in the interest of making the change as small as possible. If we all agree that new types are warranted, it should be easy to clean up these use cases.

Testing this change is difficult, because we can't produce a reliable test to rule out nondeterminism. I have done best-effort testing by hand by crafting shaders that show output nondeterminism, and then compiling them both with and without these changes.
1 parent ab5b0a7
History
Tip revision: 9a2a35f9282570ed075ebc2313f4aa9ed7a643ce authored by Tim Foley on 28 August 2020, 18:21:40 UTC
Avoid nondeterministic ordering of output (#1522)
Tip revision: 9a2a35f
File Mode Size
docs
examples
external
extras
prelude
source
tests
tools
.editorconfig -rw-r--r-- 937 bytes
.gitattributes -rw-r--r-- 95 bytes
.gitignore -rw-r--r-- 759 bytes
.gitmodules -rw-r--r-- 774 bytes
.travis.yml -rw-r--r-- 2.1 KB
CODE_OF_CONDUCT.md -rw-r--r-- 3.1 KB
LICENSE -rw-r--r-- 1.1 KB
README.md -rw-r--r-- 7.4 KB
appveyor.yml -rw-r--r-- 4.0 KB
premake.bat -rw-r--r-- 120 bytes
premake5.lua -rw-r--r-- 38.6 KB
slang-com-helper.h -rw-r--r-- 4.8 KB
slang-com-ptr.h -rw-r--r-- 4.8 KB
slang-tag-version.h -rw-r--r-- 36 bytes
slang.h -rw-r--r-- 129.5 KB
slang.sln -rw-r--r-- 13.7 KB
test.bat -rw-r--r-- 1.4 KB
travis_build.sh -rw-r--r-- 460 bytes
travis_test.sh -rw-r--r-- 435 bytes

README.md

back to top