Revision 4df910620bebb5cfe234af16ac8f6474b60215fd authored by Feng Tang on 25 November 2020, 05:22:21 UTC, committed by Linus Torvalds on 26 November 2020, 17:35:49 UTC
0day reported a -22.7% regression for the will-it-scale page_fault2
case [1] on a 4-socket, 144-CPU platform, and bisected it to
Waiman's optimization (commit bd0b230fe1), which saves one
'struct page_counter' worth of space in 'struct mem_cgroup'.

Initially we thought it was due to the cache alignment change introduced
by the patch, but further debugging shows that it is because some hot data
members ('vmstats_local', 'vmstats_percpu', 'vmstats') sit in 2 adjacent
cache lines (the 2N and 2N+1 cache lines). When adjacent cache line
prefetch is enabled, this triggers an "extended level" of cache false
sharing across the 2 adjacent cache lines.

So exchange the 2 member blocks while mostly keeping the original
cache alignment. This restores and even improves the performance,
and saves 64 bytes of space for 'struct mem_cgroup' (from 2880 to 2816
bytes, with 0day's default RHEL-8.3 kernel config).

[1]. https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/

Fixes: bd0b230fe145 ("mm/memcg: unify swap and memsw page counters")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1 parent fa02fcd
File Mode Size
Kconfig -rw-r--r-- 2.7 KB
Makefile -rw-r--r-- 457 bytes
addr.c -rw-r--r-- 3.9 KB
addr.h -rw-r--r-- 654 bytes
atm_misc.c -rw-r--r-- 2.6 KB
atm_sysfs.c -rw-r--r-- 4.1 KB
br2684.c -rw-r--r-- 22.8 KB
clip.c -rw-r--r-- 22.0 KB
common.c -rw-r--r-- 21.1 KB
common.h -rw-r--r-- 1.6 KB
ioctl.c -rw-r--r-- 9.2 KB
lec.c -rw-r--r-- 58.2 KB
lec.h -rw-r--r-- 5.1 KB
lec_arpc.h -rw-r--r-- 3.0 KB
mpc.c -rw-r--r-- 38.2 KB
mpc.h -rw-r--r-- 1.9 KB
mpoa_caches.c -rw-r--r-- 14.2 KB
mpoa_caches.h -rw-r--r-- 3.1 KB
mpoa_proc.c -rw-r--r-- 7.2 KB
pppoatm.c -rw-r--r-- 15.2 KB
proc.c -rw-r--r-- 9.6 KB
protocols.h -rw-r--r-- 418 bytes
pvc.c -rw-r--r-- 3.8 KB
raw.c -rw-r--r-- 1.9 KB
resources.c -rw-r--r-- 8.9 KB
resources.h -rw-r--r-- 1.1 KB
signaling.c -rw-r--r-- 6.2 KB
signaling.h -rw-r--r-- 877 bytes
svc.c -rw-r--r-- 16.0 KB
