Revision a0582f26ec9dfd5360ea2f35dd9a1b026f8adda0 authored by Pavel Tatashin on 31 May 2017, 15:25:24 UTC, committed by David S. Miller on 06 June 2017, 20:45:29 UTC
The current wrap implementation has a race issue: it is called outside of
the ctx_alloc_lock, and also does not wait for all CPUs to complete the
wrap.  This means that a thread can get a new context with a new version
and another thread might still be running with the same context. The
problem is especially severe on CPUs with shared TLBs, like sun4v. I used
the following test to very quickly reproduce the problem:
- start over 8K processes (must be more than context IDs)
- write and read values at a memory location in every process.

Very quickly memory corruptions start happening, and what we read back
does not equal what we wrote.
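
The reproduction steps above can be sketched in user space roughly as
follows. This is not the author's test program, only a minimal sketch under
stated assumptions: the names run_readback_test() and child_loop() are
hypothetical, the sched_yield() between the write and the read is an added
guess at how to give the scheduler a chance to switch contexts, and on a
correctly working kernel the function is expected to report zero failures.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: write a per-process value, yield, read it back. */
static int child_loop(volatile uint64_t *cell, long iters)
{
    uint64_t tag = (uint64_t)getpid() << 32;

    for (long i = 0; i < iters; i++) {
        uint64_t want = tag | (uint64_t)i;

        *cell = want;
        sched_yield();          /* assumption: encourage context switches */
        if (*cell != want)
            return 1;           /* read back something we did not write */
    }
    return 0;
}

/*
 * Hypothetical driver: fork nproc children that all use the same virtual
 * address (private to each child after fork) and count the children that
 * saw a mismatch or died abnormally. Returns -1 on allocation failure.
 */
static int run_readback_test(int nproc, long iters)
{
    volatile uint64_t *cell;
    int started = 0, failures = 0;

    assert(nproc > 0 && iters > 0);
    cell = malloc(sizeof(*cell));
    if (cell == NULL)
        return -1;

    for (int p = 0; p < nproc; p++) {
        pid_t pid = fork();

        if (pid < 0)
            break;              /* could not create more processes */
        if (pid == 0)
            _exit(child_loop(cell, iters));
        started++;
    }
    for (int p = 0; p < started; p++) {
        int status;

        if (wait(&status) < 0 || !WIFEXITED(status) ||
            WEXITSTATUS(status) != 0)
            failures++;
    }
    free((void *)cell);
    return failures;
}
```

To approximate the scenario described above, nproc would have to exceed the
number of context IDs (over 8K); a small value such as
run_readback_test(4, 1000) only checks that the harness itself works.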

Several approaches were explored before settling on this one:

Approach 1:
Move smp_new_mmu_context_version() inside ctx_alloc_lock, and wait for
every process to complete the wrap. (Note: every CPU must WAIT before
leaving smp_new_mmu_context_version_client() until every one arrives).

This approach ends up with deadlocks, as some threads own locks which other
threads are waiting for, and they never receive softint until these threads
exit smp_new_mmu_context_version_client(). Since we do not allow the exit,
deadlock happens.

Approach 2:
Handle wrap right during mondo interrupt. Use etrap/rtrap to enter into
C code, and issue new versions to every CPU.
This approach adds some overhead to runtime: in switch_mm() we must add
some checks to make sure that versions have not changed due to wrap while
we were loading the new secondary context. (This could be protected by
PSTATE_IE, but that degrades performance, since on M7 and older CPUs each
access takes 50 cycles.) Also, we still need a global per-cpu array of MMs to know
where we need to load new contexts, otherwise we can change context to a
thread that is going away (if we received mondo between switch_mm() and
switch_to() time). Finally, there are some issues with window registers in
rtrap() when context IDs are changed during CPU mondo time.

The approach in this patch is the simplest and has almost no impact on
runtime.  We use the array with mm's where last secondary contexts were
loaded onto CPUs and bump their versions to the new generation without
changing context IDs. If a new process comes in to get a context ID, it
will go through get_new_mmu_context() because of version mismatch. But the
running processes do not need to be interrupted. And wrap is quicker as we
do not need to xcall and wait for everyone to receive and complete wrap.
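
The idea in the paragraph above can be illustrated with a small user-space
model. This is a sketch, not the kernel code from this patch: every sim_*
name, the 2-bit context ID field, and the two-CPU array are hypothetical and
chosen only to keep the example short. It assumes a context value holds the
version in its high bits and the context ID in its low bits, and that there
are more usable IDs than CPUs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SIM_CTX_BITS   2u                      /* 4 IDs; ID 0 is reserved */
#define SIM_NR_CTX     (1u << SIM_CTX_BITS)
#define SIM_ID_MASK    ((unsigned long)SIM_NR_CTX - 1)
#define SIM_VER_MASK   (~SIM_ID_MASK)
#define SIM_FIRST_VER  ((unsigned long)SIM_NR_CTX)
#define SIM_NR_CPUS    2

struct sim_mm { unsigned long ctx; };          /* version bits | context ID */

static unsigned long sim_cur_ver = SIM_FIRST_VER;
static bool sim_id_used[SIM_NR_CTX];
static struct sim_mm *sim_cpu_mm[SIM_NR_CPUS]; /* last mm loaded per CPU */

/* Return a free ID in 1..SIM_NR_CTX-1, or SIM_NR_CTX if none is free. */
static unsigned int sim_find_free_id(void)
{
    unsigned int id = 1;

    while (id < SIM_NR_CTX && sim_id_used[id])
        id++;
    return id;
}

/* Start a new generation; running mms keep their IDs, only versions move. */
static void sim_wrap(void)
{
    unsigned long old_ver = sim_cur_ver;
    unsigned long new_ver = old_ver + SIM_FIRST_VER;

    if (new_ver == 0)                          /* version field overflowed */
        new_ver = SIM_FIRST_VER;
    memset(sim_id_used, 0, sizeof(sim_id_used));
    sim_id_used[0] = true;                     /* reserved ID */
    sim_cur_ver = new_ver;

    for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++) {
        struct sim_mm *mm = sim_cpu_mm[cpu];

        if (mm == NULL || (mm->ctx & SIM_VER_MASK) != old_ver)
            continue;
        mm->ctx = (mm->ctx & SIM_ID_MASK) | new_ver;
        sim_id_used[mm->ctx & SIM_ID_MASK] = true;
    }
}

/* Give mm a context in the current generation, wrapping if IDs ran out. */
static void sim_get_new_context(struct sim_mm *mm)
{
    unsigned int id = sim_find_free_id();

    if (id == SIM_NR_CTX) {
        sim_wrap();
        id = sim_find_free_id();
    }
    assert(id < SIM_NR_CTX);                   /* holds while IDs > CPUs */
    sim_id_used[id] = true;
    mm->ctx = sim_cur_ver | id;
}

/* A version mismatch is what sends an mm through the allocation path. */
static void sim_switch_mm(int cpu, struct sim_mm *mm)
{
    if ((mm->ctx & SIM_VER_MASK) != sim_cur_ver)
        sim_get_new_context(mm);
    sim_cpu_mm[cpu] = mm;
}
```

In this model, if mms a, b and c take IDs 1, 2 and 3 and a fourth mm is then
switched in on the CPU that last loaded c, the wrap moves a and c to the new
version with their IDs unchanged, the newcomer takes the freed ID 2, and b is
left with a stale version so that it would be reallocated on its next switch.
No step in the model interrupts or waits for another CPU.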

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1 parent 7a5b4bb
File Mode Size
basic
coccinelle
dtc
gcc-plugins
gdb
genksyms
kconfig
ksymoops
mod
package
selinux
tracing
.gitignore -rw-r--r-- 170 bytes
Kbuild.include -rw-r--r-- 15.1 KB
Lindent -rwxr-xr-x 467 bytes
Makefile -rw-r--r-- 1.8 KB
Makefile.asm-generic -rw-r--r-- 1.1 KB
Makefile.build -rw-r--r-- 18.9 KB
Makefile.clean -rw-r--r-- 2.9 KB
Makefile.dtbinst -rw-r--r-- 1.1 KB
Makefile.extrawarn -rw-r--r-- 2.5 KB
Makefile.fwinst -rw-r--r-- 2.0 KB
Makefile.gcc-plugins -rw-r--r-- 3.2 KB
Makefile.headersinst -rw-r--r-- 4.6 KB
Makefile.help -rw-r--r-- 68 bytes
Makefile.host -rw-r--r-- 6.8 KB
Makefile.kasan -rw-r--r-- 1005 bytes
Makefile.lib -rw-r--r-- 15.6 KB
Makefile.modbuiltin -rw-r--r-- 1.8 KB
Makefile.modinst -rw-r--r-- 1.2 KB
Makefile.modpost -rw-r--r-- 5.5 KB
Makefile.modsign -rw-r--r-- 1005 bytes
Makefile.ubsan -rw-r--r-- 1.0 KB
adjust_autoksyms.sh -rwxr-xr-x 2.8 KB
asn1_compiler.c -rw-r--r-- 35.5 KB
bloat-o-meter -rwxr-xr-x 2.2 KB
bootgraph.pl -rwxr-xr-x 6.3 KB
check-lc_ctype.c -rw-r--r-- 201 bytes
check_00index.sh -rwxr-xr-x 1.3 KB
check_extable.sh -rwxr-xr-x 4.9 KB
checkincludes.pl -rwxr-xr-x 1.9 KB
checkkconfigsymbols.py -rwxr-xr-x 15.5 KB
checkpatch.pl -rwxr-xr-x 185.8 KB
checkstack.pl -rwxr-xr-x 5.4 KB
checksyscalls.sh -rwxr-xr-x 5.6 KB
checkversion.pl -rwxr-xr-x 1.9 KB
cleanfile -rwxr-xr-x 3.4 KB
cleanpatch -rwxr-xr-x 5.0 KB
coccicheck -rwxr-xr-x 7.1 KB
config -rwxr-xr-x 4.5 KB
conmakehash.c -rw-r--r-- 6.0 KB
const_structs.checkpatch -rw-r--r-- 964 bytes
decode_stacktrace.sh -rwxr-xr-x 3.7 KB
decodecode -rwxr-xr-x 2.1 KB
depmod.sh -rwxr-xr-x 1.7 KB
diffconfig -rwxr-xr-x 3.7 KB
docproc.c -rw-r--r-- 15.5 KB
export_report.pl -rwxr-xr-x 4.5 KB
extract-cert.c -rw-r--r-- 3.5 KB
extract-ikconfig -rwxr-xr-x 1.7 KB
extract-module-sig.pl -rwxr-xr-x 3.6 KB
extract-sys-certs.pl -rwxr-xr-x 3.7 KB
extract-vmlinux -rwxr-xr-x 1.6 KB
extract_xc3028.pl -rwxr-xr-x 44.6 KB
faddr2line -rwxr-xr-x 5.1 KB
gcc-goto.sh -rwxr-xr-x 495 bytes
gcc-ld -rwxr-xr-x 676 bytes
gcc-plugin.sh -rwxr-xr-x 1.0 KB
gcc-version.sh -rwxr-xr-x 822 bytes
gcc-x86_32-has-stack-protector.sh -rwxr-xr-x 184 bytes
gcc-x86_64-has-stack-protector.sh -rwxr-xr-x 209 bytes
gen_initramfs_list.sh -rwxr-xr-x 7.9 KB
get_dvb_firmware -rwxr-xr-x 25.2 KB
get_maintainer.pl -rwxr-xr-x 59.1 KB
gfp-translate -rwxr-xr-x 1.7 KB
headerdep.pl -rwxr-xr-x 3.5 KB
headers.sh -rwxr-xr-x 477 bytes
headers_check.pl -rwxr-xr-x 3.7 KB
headers_install.sh -rwxr-xr-x 1.3 KB
insert-sys-cert.c -rw-r--r-- 8.9 KB
kallsyms.c -rw-r--r-- 18.7 KB
kernel-doc -rwxr-xr-x 91.6 KB
kernel-doc-xml-ref -rwxr-xr-x 4.2 KB
ld-version.sh -rwxr-xr-x 234 bytes
link-vmlinux.sh -rwxr-xr-x 7.0 KB
makelst -rwxr-xr-x 773 bytes
markup_oops.pl -rwxr-xr-x 8.1 KB
mkcompile_h -rwxr-xr-x 2.5 KB
mkmakefile -rwxr-xr-x 1.2 KB
mksysmap -rwxr-xr-x 1.3 KB
mkuboot.sh -rwxr-xr-x 379 bytes
mkversion -rw-r--r-- 74 bytes
module-common.lds -rw-r--r-- 901 bytes
namespace.pl -rwxr-xr-x 13.1 KB
objdiff -rwxr-xr-x 2.8 KB
patch-kernel -rwxr-xr-x 9.9 KB
pnmtologo.c -rw-r--r-- 11.9 KB
profile2linkerlist.pl -rwxr-xr-x 375 bytes
prune-kernel -rwxr-xr-x 673 bytes
recordmcount.c -rw-r--r-- 17.2 KB
recordmcount.h -rw-r--r-- 16.4 KB
recordmcount.pl -rwxr-xr-x 18.0 KB
setlocalversion -rwxr-xr-x 3.9 KB
show_delta -rwxr-xr-x 3.0 KB
sign-file.c -rw-r--r-- 9.8 KB
sortextable.c -rw-r--r-- 8.4 KB
sortextable.h -rw-r--r-- 5.5 KB
spelling.txt -rw-r--r-- 23.6 KB
stackdelta -rwxr-xr-x 1.8 KB
stackusage -rwxr-xr-x 759 bytes
tags.sh -rwxr-xr-x 9.5 KB
unifdef.c -rw-r--r-- 34.8 KB
ver_linux -rwxr-xr-x 2.9 KB
xen-hypercalls.sh -rw-r--r-- 351 bytes
xz_wrap.sh -rwxr-xr-x 562 bytes