https://github.com/halide/Halide
Revision 408a27718dd818f9b39ecdae7081dd13167066f8 authored by Steve Suzuki on 28 June 2021, 17:57:29 UTC, committed by GitHub on 28 June 2021, 17:57:29 UTC
* Add definition of Target::ARMFp16

Add the definition of the feature for
ARMv8.2-a half-precision floating point data processing

* Add test to generate float16 NEON assembly

* Add check for data type in float16 NEON test

The simd_op_check test does not check the operand suffix
that indicates the data type in AArch64 NEON instructions,
  e.g. FADD V0.4S, V0.4S, V0.4S
To distinguish fp16 instructions from fp32 ones,
a suffix such as ".4S" in the example above needs to be checked.
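
The check described above can be sketched as follows. This is a minimal illustration, not the code in simd_op_check: the helper names `to_lower`, `is_boundary`, and `matches_op_with_suffix` are hypothetical. It assumes one instruction per assembly line and treats an instruction as matching only when the mnemonic appears as a whole token and some operand carries the expected arrangement suffix.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: lower-case a copy of the string so that
// "FADD V0.4S" and "fadd v0.4s" compare equal.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: true if `pos` is past the end of `s` or the
// character there is not alphanumeric (so ".4s" does not match ".4sx").
bool is_boundary(const std::string &s, std::size_t pos) {
    return pos >= s.size() || !std::isalnum(static_cast<unsigned char>(s[pos]));
}

// Hypothetical helper: true if `asm_line` contains the mnemonic `op` as a
// whole token and an operand after it ends with the arrangement `suffix`.
bool matches_op_with_suffix(const std::string &asm_line,
                            const std::string &op,
                            const std::string &suffix) {
    const std::string line = to_lower(asm_line);
    const std::string mnemonic = to_lower(op);
    const std::string arrangement = to_lower(suffix);

    for (std::size_t pos = line.find(mnemonic); pos != std::string::npos;
         pos = line.find(mnemonic, pos + 1)) {
        const std::size_t end = pos + mnemonic.size();
        const bool starts_token =
            pos == 0 || std::isspace(static_cast<unsigned char>(line[pos - 1]));
        const bool ends_token =
            end < line.size() && std::isspace(static_cast<unsigned char>(line[end]));
        if (!starts_token || !ends_token) {
            continue;  // e.g. "faddp" when looking for "fadd"
        }
        // Mnemonic found: scan the operands for the arrangement suffix.
        for (std::size_t s = line.find(arrangement, end); s != std::string::npos;
             s = line.find(arrangement, s + 1)) {
            if (is_boundary(line, s + arrangement.size())) {
                return true;
            }
        }
        return false;
    }
    return false;
}
```

With this sketch, `matches_op_with_suffix("FADD V0.4S, V0.4S, V0.4S", "fadd", ".4h")` is false, so an fp32 instruction is not mistaken for the fp16 form.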

* Generate float16 Arm aarch64 LLVM-IR

CodeGen_ARM now supports the Armv8.2-A extension for
half-precision floating-point data processing.

The target must be 64-bit with the "arm_fp16" feature enabled.
32-bit targets are not supported in this commit.

Promoting fp16 to fp32 with emulated conversion is replaced with either
 a native fp16 instruction, or
 an fp32 operation with native fp16-fp32 type conversion
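
For context on what an emulated conversion costs, here is a generic sketch of widening an IEEE 754 binary16 value to binary32 using integer bit operations. It is not Halide's implementation, and the function name `half_bits_to_float` is hypothetical; it only assumes that "emulated" means the conversion is done in software rather than by a hardware conversion instruction.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: convert the raw bits of an IEEE 754 binary16 value
// (1 sign bit, 5 exponent bits, 10 mantissa bits) to a 32-bit float.
float half_bits_to_float(std::uint16_t h) {
    const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    const std::uint32_t exponent = (h >> 10) & 0x1Fu;
    std::uint32_t mantissa = h & 0x3FFu;
    std::uint32_t bits = 0;

    if (exponent == 0x1Fu) {
        // Infinity or NaN: all-ones exponent, mantissa payload preserved.
        bits = sign | 0x7F800000u | (mantissa << 13);
    } else if (exponent != 0) {
        // Normal number: rebias the exponent from 15 to 127 (add 112).
        bits = sign | ((exponent + 112u) << 23) | (mantissa << 13);
    } else if (mantissa == 0) {
        // Signed zero.
        bits = sign;
    } else {
        // Subnormal half: shift until the implicit leading 1 appears,
        // lowering the exponent by one for every shift.
        int shift = 0;
        while ((mantissa & 0x400u) == 0) {
            mantissa <<= 1;
            ++shift;
        }
        mantissa &= 0x3FFu;  // drop the now-implicit leading 1
        bits = sign | (static_cast<std::uint32_t>(113 - shift) << 23) | (mantissa << 13);
    }

    float result;
    std::memcpy(&result, &bits, sizeof(result));  // bit-exact reinterpretation
    return result;
}
```

The branches and the normalization loop are the work that a native fp16-fp32 conversion instruction performs in hardware.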

* Fix format and comments for arm_fp16 feature

Co-authored-by: Liam O'Neil <liam.oneil@arm.com>
1 parent bfd9cea
Tip revision: 408a277, Float16 support in CodeGen_ARM (#6102)
File Mode Size
.github
apps
cmake
dependencies
doc
packaging
python_bindings
src
test
tools
tutorial
util
.clang-format -rw-r--r-- 1.4 KB
.clang-format-ignore -rw-r--r-- 265 bytes
.clang-tidy -rw-r--r-- 1.7 KB
.gitattributes -rw-r--r-- 342 bytes
.gitignore -rw-r--r-- 1.1 KB
.gitmodules -rw-r--r-- 0 bytes
CMakeLists.txt -rw-r--r-- 5.5 KB
CMakePresets.json -rw-r--r-- 5.2 KB
CODE_OF_CONDUCT.md -rw-r--r-- 3.5 KB
LICENSE.txt -rw-r--r-- 3.2 KB
Makefile -rw-r--r-- 101.2 KB
README.md -rw-r--r-- 16.5 KB
README_cmake.md -rw-r--r-- 69.3 KB
README_rungen.md -rw-r--r-- 12.1 KB
README_webassembly.md -rw-r--r-- 8.6 KB
run-clang-format.sh -rwxr-xr-x 1.4 KB
run-clang-tidy.sh -rwxr-xr-x 3.1 KB

README.md
