https://github.com/halide/Halide
Revision d80bb234b2fee34a37967e59e4e8565a441969b9 authored by Andrew Adams on 19 October 2021, 16:43:07 UTC, committed by GitHub on 19 October 2021, 16:43:07 UTC
* Add a new unsigned division method

It averages with rounding up instead of rounding down, which reduces the
instruction count on x86: a single vpavgw replaces the subtract, shift,
add sequence.

Division by 7 before:
        vpmulhuw        .LCPI0_1(%rip), %ymm0, %ymm1
        vpsubw  %ymm1, %ymm0, %ymm0
        vpsrlw  $1, %ymm0, %ymm0
        vpaddw  %ymm1, %ymm0, %ymm0
        vpsrlw  $2, %ymm0, %ymm0

Division by 7 after:
        vpmulhuw        .LCPI0_1(%rip), %ymm0, %ymm1
        vpavgw  %ymm0, %ymm1, %ymm0
        vpsrlw  $2, %ymm0, %ymm0

* Remove debugging code

* Add comment elaborating on why this is a good idea
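
The two instruction sequences in the commit message can be emulated in scalar C++ to see why both compute x / 7 on 16-bit lanes. This is only a sketch, not Halide's implementation. The listings hide the multiplier behind `.LCPI0_1`, so the constants 9363 and 9362 below are assumptions derived by hand, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// High 16 bits of the 32-bit product: scalar analogue of vpmulhuw.
static uint16_t mulhi_u16(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>((static_cast<uint32_t>(a) * b) >> 16);
}

// Average rounding up, (a + b + 1) >> 1: scalar analogue of vpavgw.
static uint16_t avg_round_up_u16(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>((static_cast<uint32_t>(a) + b + 1) >> 1);
}

// "Before" shape: multiply-high, subtract, shift, add, shift.
// ASSUMPTION: multiplier 9363 = ceil(2^19 / 7) - 2^16.
uint16_t div7_round_down(uint16_t x) {
    uint16_t q = mulhi_u16(x, 9363);
    // floor((x + q) / 2) without overflowing 16 bits; q <= x always holds.
    uint16_t t = static_cast<uint16_t>(((x - q) >> 1) + q);
    return static_cast<uint16_t>(t >> 2);
}

// "After" shape: multiply-high, average rounding up, shift.
// ASSUMPTION: multiplier 9362 = floor(2^19 / 7) - 2^16. The smaller
// multiplier compensates for the +1 that the rounding-up average adds.
uint16_t div7_round_up(uint16_t x) {
    uint16_t q = mulhi_u16(x, 9362);
    return static_cast<uint16_t>(avg_round_up_u16(q, x) >> 2);
}

// Exhaustively compare both sketches against x / 7 for every uint16_t value.
void check_div7_sketches() {
    for (uint32_t i = 0; i <= 0xFFFF; i++) {
        uint16_t x = static_cast<uint16_t>(i);
        assert(div7_round_down(x) == x / 7);
        assert(div7_round_up(x) == x / 7);
    }
}
```

Calling `check_div7_sketches()` from a test harness exercises all 65536 inputs. The rounding-up variant needs one fewer arithmetic step per lane because the add and the first shift are fused into the average.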
1 parent deeb6bc
Tip revision: d80bb234b2fee34a37967e59e4e8565a441969b9 authored by Andrew Adams on 19 October 2021, 16:43:07 UTC
Add a new unsigned division method (#6322)
File                    Mode        Size
.github
apps
cmake
dependencies
doc
packaging
python_bindings
src
test
tools
tutorial
util
.clang-format           -rw-r--r--  1.4 KB
.clang-format-ignore    -rw-r--r--  265 bytes
.clang-tidy             -rw-r--r--  1.8 KB
.gitattributes          -rw-r--r--  342 bytes
.gitignore              -rw-r--r--  1.1 KB
.gitmodules             -rw-r--r--  0 bytes
CMakeLists.txt          -rw-r--r--  5.5 KB
CMakePresets.json       -rw-r--r--  5.2 KB
CODE_OF_CONDUCT.md      -rw-r--r--  3.5 KB
LICENSE.txt             -rw-r--r--  3.2 KB
Makefile                -rw-r--r--  101.0 KB
README.md               -rw-r--r--  16.5 KB
README_cmake.md         -rw-r--r--  69.3 KB
README_rungen.md        -rw-r--r--  12.1 KB
README_webassembly.md   -rw-r--r--  8.6 KB
run-clang-format.sh     -rwxr-xr-x  1.4 KB
run-clang-tidy.sh       -rwxr-xr-x  3.2 KB

README.md
