Revision 36d74a8cbf9c4129f608cd97d231961f1bd99c4c authored by Andrew Adams on 27 February 2024, 01:56:59 UTC, committed by GitHub on 27 February 2024, 01:56:59 UTC
* Avoid redundant scope lookups

This pattern has been bugging me for a long time:

```
if (scope.contains(key)) {
  Foo f = scope.get(key);
}
```

This looks up the key in the scope twice. I've finally gotten around
to fixing it. I've introduced a find method that returns a const
pointer to the value if it exists, or null otherwise. It also searches
any containing scopes, which are held by const pointer, so the method
has to return a const pointer.

```
if (const Foo *f = scope.find(key)) {
}
```

For cases where you want to get and then mutate, I added shallow_find,
which doesn't search enclosing scopes, but returns a mutable pointer.
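
To make the difference between the two lookups concrete, here is a minimal
sketch. MiniScope is a hypothetical stand-in, not Halide's actual Scope
class: it assumes one value per name (a real scope would likely keep a stack
of values per name) and only illustrates the find / shallow_find contract
described above.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a scope; NOT Halide's actual Scope class.
// Assumption: one value per name, to keep the sketch short.
template<typename T>
class MiniScope {
    std::map<std::string, T> table;
    // The enclosing scope is held by const pointer, so nothing found
    // through it may be handed out as mutable.
    const MiniScope *containing = nullptr;

public:
    MiniScope() = default;
    explicit MiniScope(const MiniScope *parent)
        : containing(parent) {
    }

    void push(const std::string &name, T value) {
        auto result = table.emplace(name, value);
        if (!result.second) {
            // The name already exists in this scope: rebind it.
            result.first->second = std::move(value);
        }
    }

    // One map lookup per scope level. Falls back to enclosing scopes and
    // returns nullptr if the name is not bound anywhere.
    const T *find(const std::string &name) const {
        auto it = table.find(name);
        if (it != table.end()) {
            return &it->second;
        }
        return containing ? containing->find(name) : nullptr;
    }

    // Searches this scope only, so it can return a mutable pointer.
    T *shallow_find(const std::string &name) {
        auto it = table.find(name);
        return it == table.end() ? nullptr : &it->second;
    }
};
```

With an inner scope constructed from a pointer to an outer one, find on the
inner scope sees names bound in either, while shallow_find only sees names
bound directly in the inner scope.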

We were also doing redundant scope lookups in ScopedBinding. We stored
the key in the helper object, and then did a pop on that key in the
ScopedBinding destructor. This commit changes Scope so that Scope::push
returns an opaque token that you can pass to Scope::pop to have it
remove that element without doing a fresh lookup. ScopedBinding now uses
this. Under the hood it's just an iterator on the underlying map (map
iterators are not invalidated on inserting or removing other stuff).
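
The token idea can be sketched as follows. TokenScope and Binding are
hypothetical names used only for illustration, not the real Scope and
ScopedBinding; the sketch assumes a std::map from name to a stack of values
and uses the map iterator itself as the token.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; NOT Halide's actual Scope class.
template<typename T>
class TokenScope {
    std::map<std::string, std::vector<T>> table;

public:
    // The token is just an iterator into the map. std::map iterators stay
    // valid when other entries are inserted or erased.
    using Token = typename std::map<std::string, std::vector<T>>::iterator;

    Token push(const std::string &name, T value) {
        // emplace does the single lookup: it finds or creates the entry.
        Token it = table.emplace(name, std::vector<T>{}).first;
        it->second.push_back(std::move(value));
        return it;
    }

    // Removes the innermost binding for the token's name, no fresh lookup.
    void pop(Token it) {
        it->second.pop_back();
        if (it->second.empty()) {
            table.erase(it);
        }
    }

    bool contains(const std::string &name) const {
        return table.count(name) != 0;
    }
};

// RAII helper in the spirit of ScopedBinding: it stores the token rather
// than the key, so the destructor does not search the map again.
template<typename T>
class Binding {
    TokenScope<T> &scope;
    typename TokenScope<T>::Token token;

public:
    Binding(TokenScope<T> &s, const std::string &name, T value)
        : scope(s), token(s.push(name, std::move(value))) {
    }
    ~Binding() {
        scope.pop(token);
    }
    Binding(const Binding &) = delete;
    Binding &operator=(const Binding &) = delete;
};
```

An entry is erased only once its stack is empty, so nested bindings of the
same name never leave an outer binding holding a dangling token.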

The net effect is to speed up local laplacian lowering by about 5%.

I also considered making it look more like an STL class, and having find
return an iterator, but it doesn't really work. The iterator it returns
might point to an entry in an enclosing scope, in which case you can't
compare it to the .end() method of the scope you have. Scopes are
different enough from maps that the interface really needs to be
distinct.

* Pacify clang-tidy

* Fix unintentional mutation of interval in scope

* Fix accidental Scope::get

* Rewrite the skip stages lowering pass

Skip stages was slow due to poor computational complexity (possibly
quadratic).

I reworked it into a two-pass linear-time algorithm. The first pass
remembers which pieces of IR are actually relevant to the task, and the
second pass performs the task using a bounds-inference-like algorithm.

On main, resnet50 spends 519 ms in this pass. This commit reduces it to
40 ms. Local laplacian with 100 pyramid levels spends 7.4 seconds in
this pass. This commit reduces it to ~3 ms.

This commit also moves the cache store for memoized Funcs into the
produce node, instead of at the top of the consume node, because it
naturally places it inside a condition you inject into the produce node.

* clang-tidy fixes

* Fix skip stages interaction with compute_with

* Unify let visitors, and use fewer stack frames for them

* Fix accidental leakage of .used into .loaded

* Visit the bodies of uninteresting let chains

* Another used -> loaded

* Fix hoist_storage not handling condition correctly.

---------

Co-authored-by: Steven Johnson <srj@google.com>
1 parent 2b5beb3
CMakePresets.json
{
  "version": 3,
  "cmakeMinimumRequired": {
    "major": 3,
    "minor": 22,
    "patch": 0
  },
  "configurePresets": [
    {
      "name": "base",
      "hidden": true,
      "binaryDir": "build/${presetName}",
      "installDir": "install/${presetName}"
    },
    {
      "name": "ci",
      "hidden": true,
      "inherits": "base",
      "toolchainFile": "${sourceDir}/cmake/toolchain.${presetName}.cmake",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "RelWithDebInfo"
      }
    },
    {
      "name": "windows-only",
      "hidden": true,
      "condition": {
        "type": "equals",
        "lhs": "${hostSystemName}",
        "rhs": "Windows"
      }
    },
    {
      "name": "vcpkg",
      "hidden": true,
      "toolchainFile": "$env{VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake"
    },
    {
      "name": "vs2022",
      "hidden": true,
      "inherits": [
        "vcpkg",
        "windows-only"
      ],
      "generator": "Visual Studio 17 2022",
      "toolset": "host=x64"
    },
    {
      "name": "debug",
      "inherits": "base",
      "displayName": "Debug",
      "description": "Debug build with no special settings",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Debug"
      }
    },
    {
      "name": "release",
      "inherits": "base",
      "displayName": "Release",
      "description": "Release build with no special settings",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Release"
      }
    },
    {
      "name": "debian-debug",
      "inherits": "debug",
      "displayName": "Debian (Debug)",
      "description": "Debug build assuming Debian-provided dependencies",
      "cacheVariables": {
        "Halide_SHARED_LLVM": "ON"
      }
    },
    {
      "name": "debian-release",
      "inherits": "debian-debug",
      "displayName": "Debian (Release)",
      "description": "Release build assuming Debian-provided dependencies",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Release"
      }
    },
    {
      "name": "win32",
      "inherits": [
        "vs2022",
        "base"
      ],
      "displayName": "Win32 (Visual Studio)",
      "description": "Visual Studio-based Win32 build with vcpkg dependencies.",
      "architecture": "Win32"
    },
    {
      "name": "win64",
      "inherits": [
        "vs2022",
        "base"
      ],
      "displayName": "Win64 (Visual Studio)",
      "description": "Visual Studio-based x64 build with vcpkg dependencies.",
      "architecture": "x64"
    },
    {
      "name": "package",
      "hidden": true,
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Release",
        "LLVM_DIR": "$env{LLVM_DIR}",
        "Clang_DIR": "$env{Clang_DIR}",
        "LLD_DIR": "$env{LLD_DIR}",
        "WITH_TESTS": "NO",
        "WITH_TUTORIALS": "NO",
        "WITH_DOCS": "YES",
        "WITH_UTILS": "YES",
        "WITH_PYTHON_BINDINGS": "NO",
        "CMAKE_INSTALL_DATADIR": "share/Halide"
      }
    },
    {
      "name": "package-windows",
      "inherits": [
        "package",
        "vs2022"
      ],
      "displayName": "Package ZIP for Windows",
      "description": "Build for packaging Windows shared libraries.",
      "binaryDir": "${sourceDir}/build",
      "cacheVariables": {
        "BUILD_SHARED_LIBS": "YES",
        "CMAKE_INSTALL_BINDIR": "bin/$<CONFIG>",
        "CMAKE_INSTALL_LIBDIR": "lib/$<CONFIG>",
        "Halide_INSTALL_CMAKEDIR": "lib/cmake/Halide",
        "Halide_INSTALL_HELPERSDIR": "lib/cmake/HalideHelpers"
      }
    },
    {
      "name": "package-unix-shared",
      "inherits": "package",
      "displayName": "Package UNIX shared libs",
      "description": "Build for packaging UNIX shared libraries.",
      "binaryDir": "shared-Release",
      "cacheVariables": {
        "BUILD_SHARED_LIBS": "YES"
      }
    },
    {
      "name": "package-unix-static",
      "inherits": "package",
      "displayName": "Package UNIX static libs",
      "description": "Build for packaging UNIX static libraries.",
      "binaryDir": "static-Release",
      "cacheVariables": {
        "BUILD_SHARED_LIBS": "NO",
        "Halide_BUNDLE_LLVM": "YES"
      }
    },
    {
      "name": "linux-x64-asan",
      "inherits": "ci",
      "displayName": "ASAN (Linux x64)",
      "description": "Build everything with ASAN enabled",
      "cacheVariables": {
        "LLVM_ROOT": "$penv{LLVM_ROOT}"
      }
    },
    {
      "name": "linux-x64-fuzzer",
      "inherits": "ci",
      "displayName": "Fuzzer (Linux x64)",
      "description": "Build everything with fuzzing enabled",
      "cacheVariables": {
        "LLVM_ROOT": "$penv{LLVM_ROOT}",
        "TARGET_WEBASSEMBLY": "NO",
        "WITH_TUTORIALS": "NO",
        "WITH_UTILS": "NO",
        "WITH_PYTHON_BINDINGS": "NO",
        "WITH_TESTS": "YES",
        "WITH_TEST_AUTO_SCHEDULE": "NO",
        "WITH_TEST_CORRECTNESS": "NO",
        "WITH_TEST_ERROR": "NO",
        "WITH_TEST_WARNING": "NO",
        "WITH_TEST_PERFORMANCE": "NO",
        "WITH_TEST_RUNTIME": "NO",
        "WITH_TEST_GENERATOR": "NO",
        "WITH_TEST_FUZZ": "YES",
        "BUILD_SHARED_LIBS": "NO"
      }
    }
  ],
  "buildPresets": [
    {
      "name": "debug",
      "configurePreset": "debug",
      "displayName": "Debug",
      "description": "Debug build with no special settings"
    },
    {
      "name": "release",
      "configurePreset": "release",
      "displayName": "Release",
      "description": "Release build with no special settings"
    },
    {
      "name": "linux-x64-asan",
      "configurePreset": "linux-x64-asan",
      "displayName": "ASAN (Linux x64)",
      "description": "Build everything with ASAN enabled"
    },
    {
      "name": "linux-x64-fuzzer",
      "configurePreset": "linux-x64-fuzzer",
      "displayName": "Fuzzing (Linux x64)",
      "description": "Build everything with fuzzing enabled"
    }
  ],
  "testPresets": [
    {
      "name": "debug",
      "configurePreset": "debug",
      "displayName": "Debug",
      "description": "Test everything with Debug build",
      "output": {
        "outputOnFailure": true
      }
    },
    {
      "name": "release",
      "configurePreset": "release",
      "displayName": "Release",
      "description": "Test everything with Release build",
      "output": {
        "outputOnFailure": true
      }
    },
    {
      "name": "linux-x64-asan",
      "configurePreset": "linux-x64-asan",
      "displayName": "ASAN (Linux x64)",
      "description": "Test everything with ASAN enabled",
      "environment": {
        "ASAN_OPTIONS": "detect_leaks=0:detect_container_overflow=0"
      },
      "output": {
        "outputOnFailure": true
      }
    },
    {
      "name": "linux-x64-fuzzer",
      "configurePreset": "linux-x64-fuzzer",
      "displayName": "Fuzzing (Linux x64)",
      "description": "Test everything with fuzzing enabled",
      "output": {
        "outputOnFailure": true
      }
    }
  ]
}