Revision c3882a5a00025b113eeb40385dad0b5aeee271b6 authored by Dillon Sharlet on 11 March 2021, 23:14:36 UTC, committed by GitHub on 11 March 2021, 23:14:36 UTC
* Implement sliding window warmups by backing up the loop min.

* Fix indirect sliding windows.

* Improve is_monotonic.

* Small cleanups.

* Avoid generating vector valued bounds.

* Fix build error on some compilers.

* Fix loop bounds.

* Don't try to slide things that should just be compute_at the store_at location.

* Print condition when printing boxes.

* Less things broken.

* Add/fix comments.

* Comments

* Fix async by moving if inside consume (and so inside acquires).

* Fix division.

* This doesn't work on master either.

* Add TODO

* Acquire is not a no-op.

* Add comment about unfortunate simplification.

* Remove debug(0)

* Add simplification of for { acquire { noop } }

* Fix folding factors finally!

* Update storage_folding test.

* Fix bug when cloning a semaphore used more than once.

* Disable failing test.

* Work around bad complexity in is_monotonic.

* Fix sub bug

* Significantly faster schedule for blur.

* Update tracing test.

* New simplifications that help with upsampled and downsampled sliding windows.

* This doesn't need explicit folding any more.

* Fix new simplifier rules.

* Fix simplifier div rule

* Remove ancient brittle test.

* Fix simplify rule again

* More LT -> EQ rules for mod

* Fix nested sliding windows with upsamples.

* Replace hack with better solution.

* Add missing override

* Don't rewrite loop variable if the min doesn't change.

* Refactor sliding window lowering.

* Fixed bounds growing redundantly for independent producers.

* Don't take the union unless possibly needed.

* Respect conditional provide/required.

* Add missing overrides

* Much better schedule.

* Use a smaller image for blur benchmarking so that different schedules have different perf

* Replace Interval with ConstantInterval for is_monotonic.

* Don't try to handle unsigned deltas.

* Add failing test.

* Remove unused new code.

* Remove weird debugging code.

* Avoid expanding bounds of split producers

* Remove stray likely_if_innermost.

* Remove old autotune tests.

* Update test for guarded producers.

* Reenable test.

* Update trace for guarding producers.

* Don't overwrite required.used

* Handle LE/LT in bounds of lanes in vectorize

* Fix acquire and release of warmups

* Earlier fix for multiply cloned acquires was wrong.

* Handle nested vectorization.

* clang-format

* Remove autotune_bug_* tests

* Fix shadowing error on some compilers.

* Appease overzealous clang-tidy warning.

* clang-format

* Don't use silly hack.

* clang-tidy...

* It's no longer safe to assume monotonic means bounds_of_expr_in_scope is exact

* Address review comments

* Add comment

* Add missing override.

* Fix constant interval issues.

* Revert and remove empty interval

* Fix multiply!?

* Reduce need for simplifications.

* Simplifications from dsharletg/sliding-window branch

* Don't learn likely(x) and x.

* Add comment

* Add some min/max rules.

* Also substitute facts from asserts

* Remove is_empty from header too.

* More rules

* Add double stairstep rule.

* Disable rule that uncovers bugs.

* Consider anded expressions as if they were independent nested ifs.

* Add promise_clamped to producer guards.

* Revert "Consider anded expressions as if they were independent nested ifs."

This reverts commit 03efb3f784b3078b64961c98edde383f4de04fb4.

* Don't combine ifs, split them instead.

* Update trace

* clang-tidy/clang-format

* Remove splitting of ifs, it breaks brittle tests.

* Safer check on old conditions.

* Fix producer guard condition.

* Interval fixes.

* Handle sliding backwards

* Handle transitive dependencies.

* Backport abadams' fix from abadams/slide_over_split_loop

* Fix select visitor.

* More simplifier rules.

* Bring back old logic as a fallback.

* Avoid specializations corrupting sliding

* Fix boneheaded rule errors.

* Fix slightly conservative bounds at the max for split case.

* This pattern is too sensitive to the simplifier. In a real use case, it's just a sum, and the result can be subtracted after doing a reduction.

* Add missing clamp rule

* Don't count unlikely loops as inner loops for likely_if_innermost

* Use <= instead of == to solve for the new loop min

Useful when the warmup is a partial vector or something

* Verify simplifier changes and add variants as suggested by synthesizer

* Make implicit assumption explicit, for clarity

* Use find_constant_bounds

* Guard against expanded bounds more effectively.

* Update tracing test

* Small cleanup.

* Don't simplify/prove using lets that might change value.

* Stronger solving without expanding lets.

* New simplifier rule for alignment

* Fix case where no warmup needed

* Add some useful rules.

* Add safety check on when we can use the new loop min.

* Better proof to avoid hacky condition that is hard to prove.

* Small cleanup and use the nice new folding factors.

* Bring back unrolled producer test.

* clang-format

* Expand comment.

* Fix sliding backwards condition.

* min(new_loop_min, loop_min) isn't needed any more.

* We need that min, but we can be more conservative about it.

* Stronger handling of previous loop mins.

* Remove unused is_monotonic_strong.

* Remove ConstantInterval::make_intersection.

* Avoid need to handle uint specially.

* Add cache for depends_on.

* Reduce unnecessarily large cache scope

* The first part of the key is always the same

Co-authored-by: Andrew Adams <andrew.b.adams@gmail.com>
1 parent c2a0db1
File Mode Size
.github
apps
cmake
dependencies
doc
packaging
python_bindings
src
test
tools
tutorial
util
.clang-format -rw-r--r-- 1.4 KB
.clang-format-ignore -rw-r--r-- 265 bytes
.clang-tidy -rw-r--r-- 1.7 KB
.gitattributes -rw-r--r-- 342 bytes
.gitignore -rw-r--r-- 1.1 KB
.gitmodules -rw-r--r-- 0 bytes
CMakeLists.txt -rw-r--r-- 4.2 KB
CMakePresets.json -rw-r--r-- 2.4 KB
CODE_OF_CONDUCT.md -rw-r--r-- 3.5 KB
LICENSE.txt -rw-r--r-- 3.2 KB
Makefile -rw-r--r-- 99.9 KB
README.md -rw-r--r-- 14.8 KB
README_cmake.md -rw-r--r-- 69.0 KB
README_rungen.md -rw-r--r-- 12.1 KB
README_webassembly.md -rw-r--r-- 7.5 KB
run-clang-format.sh -rwxr-xr-x 1.4 KB
run-clang-tidy.sh -rwxr-xr-x 3.1 KB