Revision 5d5fc33ce58e81e8738816f5ee59f8e85fd3b404 authored by Anton Blanchard on 07 June 2024, 06:13:35 UTC, committed by Palmer Dabbelt on 26 July 2024, 12:50:45 UTC
Many CPUs implement return address branch prediction as a stack. The
RISC-V architecture refers to this as a return address stack (RAS). If
this gets corrupted then the CPU will mispredict at least one but
potentially many function returns.

There are two issues with the current RISC-V exception code:

- We are using the alternate link register (x5/t0) for the indirect
  branch, which makes the hardware think this is a function return. This
  will corrupt the RAS.

- We modify the return address of handle_exception to point to
  ret_from_exception. This will also corrupt the RAS.
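
The two issues above can be illustrated with a small model. This is only
a sketch, not the kernel code: it assumes the RAS push/pop hints that the
RISC-V unprivileged specification describes for JALR (x1 and x5 count as
link registers), and the function names and the toy stack contents are
hypothetical.

```python
# Toy model of a return address stack (RAS). Names are illustrative only.
LINK_REGS = {"x1", "x5"}  # ra and t0, the two link registers


def ras_action(rd, rs1):
    """RAS hint for a JALR with the given rd/rs1, per the spec's hint table."""
    rd_link = rd in LINK_REGS
    rs1_link = rs1 in LINK_REGS
    if not rd_link and not rs1_link:
        return "none"
    if not rd_link:
        return "pop"       # looks like a function return
    if not rs1_link or rd == rs1:
        return "push"      # looks like a function call
    return "pop+push"      # coroutine-style swap


def count_mispredicts(ras, actual_returns):
    """Pop one prediction per real return and count the wrong ones."""
    ras = list(ras)
    misses = 0
    for target in actual_returns:
        predicted = ras.pop() if ras else None
        if predicted != target:
            misses += 1
    return misses


# Issue 1: "jr t0" is "jalr x0, 0(x5)", which the hardware treats as a pop.
spurious = ras_action("x0", "x5")

# Three nested calls pushed A, B, C; the real returns go to C, B, A.
intact = count_mispredicts(["A", "B", "C"], ["C", "B", "A"])
# After the spurious pop the top entry is gone and every return is off by one.
after_pop = count_mispredicts(["A", "B"], ["C", "B", "A"])

# Issue 2: the call pushed X, but the return address was rewritten to Y.
rewritten = count_mispredicts(["A", "B", "X"], ["Y", "B", "A"])
```

In this model an indirect branch through a non-link register such as t1
(x6) yields "none", so the RAS stays untouched.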

Testing the null system call latency before and after the patch:

VisionFive 2 (StarFive JH7110 / U74)
baseline: 189.87 ns
patched:  176.76 ns

Lichee Pi 4A (T-Head TH1520 / C910)
baseline: 666.58 ns
patched:  636.90 ns

That is an improvement of roughly 7% on the U74 and just over 4% on the
C910.
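
The percentages depend on which figure is used as the reference. A quick
check using only the latencies quoted above (the helper names are
hypothetical):

```python
def latency_reduction_pct(baseline_ns, patched_ns):
    """Latency saved, as a percentage of the baseline latency."""
    return (baseline_ns - patched_ns) / baseline_ns * 100.0


def speedup_pct(baseline_ns, patched_ns):
    """Throughput gain, i.e. the saving relative to the patched latency."""
    return (baseline_ns / patched_ns - 1.0) * 100.0


u74_reduction = latency_reduction_pct(189.87, 176.76)   # slightly under 7
u74_speedup = speedup_pct(189.87, 176.76)                # slightly over 7
c910_reduction = latency_reduction_pct(666.58, 636.90)  # between 4 and 5
c910_speedup = speedup_pct(666.58, 636.90)               # between 4 and 5
```

Measured against the patched latency, the U74 figure lands just over 7%;
measured against the baseline it lands just under.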

Signed-off-by: Anton Blanchard <antonb@tenstorrent.com>
Signed-off-by: Cyril Bur <cyrilbur@tenstorrent.com>
Tested-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Link: https://lore.kernel.org/r/20240607061335.2197383-1-cyrilbur@tenstorrent.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
1 parent 8d22d0d
File Mode Size
Makefile -rw-r--r-- 515 bytes
advise.c -rw-r--r-- 2.4 KB
advise.h -rw-r--r-- 316 bytes
alloc_cache.h -rw-r--r-- 1.3 KB
cancel.c -rw-r--r-- 8.0 KB
cancel.h -rw-r--r-- 962 bytes
epoll.c -rw-r--r-- 1.3 KB
epoll.h -rw-r--r-- 213 bytes
eventfd.c -rw-r--r-- 3.7 KB
eventfd.h -rw-r--r-- 277 bytes
fdinfo.c -rw-r--r-- 7.1 KB
fdinfo.h -rw-r--r-- 100 bytes
filetable.c -rw-r--r-- 3.9 KB
filetable.h -rw-r--r-- 2.2 KB
fs.c -rw-r--r-- 6.8 KB
fs.h -rw-r--r-- 929 bytes
futex.c -rw-r--r-- 9.3 KB
futex.h -rw-r--r-- 1.2 KB
io-wq.c -rw-r--r-- 32.9 KB
io-wq.h -rw-r--r-- 2.1 KB
io_uring.c -rw-r--r-- 100.0 KB
io_uring.h -rw-r--r-- 12.6 KB
kbuf.c -rw-r--r-- 20.3 KB
kbuf.h -rw-r--r-- 4.7 KB
memmap.c -rw-r--r-- 7.8 KB
memmap.h -rw-r--r-- 921 bytes
msg_ring.c -rw-r--r-- 8.1 KB
msg_ring.h -rw-r--r-- 269 bytes
napi.c -rw-r--r-- 7.8 KB
napi.h -rw-r--r-- 2.4 KB
net.c -rw-r--r-- 44.6 KB
net.h -rw-r--r-- 2.1 KB
nop.c -rw-r--r-- 961 bytes
nop.h -rw-r--r-- 168 bytes
notif.c -rw-r--r-- 3.2 KB
notif.h -rw-r--r-- 1.3 KB
opdef.c -rw-r--r-- 16.1 KB
opdef.h -rw-r--r-- 1.3 KB
openclose.c -rw-r--r-- 7.3 KB
openclose.h -rw-r--r-- 754 bytes
poll.c -rw-r--r-- 28.4 KB
poll.h -rw-r--r-- 1.2 KB
refs.h -rw-r--r-- 1.3 KB
register.c -rw-r--r-- 12.8 KB
register.h -rw-r--r-- 218 bytes
rsrc.c -rw-r--r-- 24.5 KB
rsrc.h -rw-r--r-- 3.6 KB
rw.c -rw-r--r-- 30.3 KB
rw.h -rw-r--r-- 1.2 KB
slist.h -rw-r--r-- 2.7 KB
splice.c -rw-r--r-- 2.9 KB
splice.h -rw-r--r-- 306 bytes
sqpoll.c -rw-r--r-- 12.0 KB
sqpoll.h -rw-r--r-- 848 bytes
statx.c -rw-r--r-- 1.6 KB
statx.h -rw-r--r-- 217 bytes
sync.c -rw-r--r-- 2.8 KB
sync.h -rw-r--r-- 460 bytes
tctx.c -rw-r--r-- 7.4 KB
tctx.h -rw-r--r-- 992 bytes
timeout.c -rw-r--r-- 17.8 KB
timeout.h -rw-r--r-- 1.2 KB
truncate.c -rw-r--r-- 1.0 KB
truncate.h -rw-r--r-- 180 bytes
uring_cmd.c -rw-r--r-- 9.4 KB
uring_cmd.h -rw-r--r-- 347 bytes
waitid.c -rw-r--r-- 9.1 KB
waitid.h -rw-r--r-- 484 bytes
xattr.c -rw-r--r-- 5.5 KB
xattr.h -rw-r--r-- 654 bytes
