Revision 4df910620bebb5cfe234af16ac8f6474b60215fd authored by Feng Tang on 25 November 2020, 05:22:21 UTC, committed by Linus Torvalds on 26 November 2020, 17:35:49 UTC
0day reported one -22.7% regression for will-it-scale page_fault2 case [1] on a 4 sockets 144 CPU platform, and bisected to it to be caused by Waiman's optimization (commit bd0b230fe1) of saving one 'struct page_counter' space for 'struct mem_cgroup'. Initially we thought it was due to the cache alignment change introduced by the patch, but further debug shows that it is due to some hot data members ('vmstats_local', 'vmstats_percpu', 'vmstats') sit in 2 adjacent cacheline (2N and 2N+1 cacheline), and when adjacent cache line prefetch is enabled, it triggers an "extended level" of cache false sharing for 2 adjacent cache lines. So exchange the 2 member blocks, while keeping mostly the original cache alignment, which can restore and even enhance the performance, and save 64 bytes of space for 'struct mem_cgroup' (from 2880 to 2816, with 0day's default RHEL-8.3 kernel config) [1]. https://lore.kernel.org/lkml/20201102091543.GM31092@shao2-debian/ Fixes: bd0b230fe145 ("mm/memcg: unify swap and memsw page counters") Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Feng Tang <feng.tang@intel.com> Acked-by: Waiman Long <longman@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1 parent fa02fcd
File | Mode | Size |
---|---|---|
6lowpan | ||
802 | ||
8021q | ||
9p | ||
appletalk | ||
atm | ||
ax25 | ||
batman-adv | ||
bluetooth | ||
bpf | ||
bpfilter | ||
bridge | ||
caif | ||
can | ||
ceph | ||
core | ||
dcb | ||
dccp | ||
decnet | ||
dns_resolver | ||
dsa | ||
ethernet | ||
ethtool | ||
hsr | ||
ieee802154 | ||
ife | ||
ipv4 | ||
ipv6 | ||
iucv | ||
kcm | ||
key | ||
l2tp | ||
l3mdev | ||
lapb | ||
llc | ||
mac80211 | ||
mac802154 | ||
mpls | ||
mptcp | ||
ncsi | ||
netfilter | ||
netlabel | ||
netlink | ||
netrom | ||
nfc | ||
nsh | ||
openvswitch | ||
packet | ||
phonet | ||
psample | ||
qrtr | ||
rds | ||
rfkill | ||
rose | ||
rxrpc | ||
sched | ||
sctp | ||
smc | ||
strparser | ||
sunrpc | ||
switchdev | ||
tipc | ||
tls | ||
unix | ||
vmw_vsock | ||
wimax | ||
wireless | ||
x25 | ||
xdp | ||
xfrm | ||
Kconfig | -rw-r--r-- | 14.0 KB |
Makefile | -rw-r--r-- | 2.6 KB |
compat.c | -rw-r--r-- | 14.2 KB |
devres.c | -rw-r--r-- | 2.2 KB |
socket.c | -rw-r--r-- | 91.9 KB |
sysctl_net.c | -rw-r--r-- | 3.0 KB |
Computing file changes ...