Revision db1b1fefc2cecbff2e4214062fa8c680cb6e7b7d authored by Jack Steiner on 31 March 2006, 10:31:21 UTC, committed by Linus Torvalds on 31 March 2006, 20:18:58 UTC
Currently, count_active_tasks() calls both nr_running() &
nr_interruptible().  Each of these functions does a "for_each_cpu" & reads
values from the runqueue of each cpu.  Although this is not a lot of
instructions, each runqueue may be located on different node.  Depending on
the architecture, a unique TLB entry may be required to access each
runqueue.

Since there may be more runqueues than cpu TLB entries, a scan of all
runqueues can trash the TLB.  Each memory reference incurs a TLB miss &
refill.

In addition, the runqueue cacheline that contains nr_running &
nr_uninterruptible may be evicted from the cache between the two passes.
This causes unnecessary cache misses.

Combining nr_running() & nr_interruptible() into a single function
substantially reduces the TLB & cache misses on large systems.  This should
have no measureable effect on smaller systems.

On a 128p IA64 system running a memory stress workload, the new function
reduced the overhead of calc_load() from 605 usec/call to 324 usec/call.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
1 parent 3055add
History
File Mode Size
Documentation
arch
block
crypto
drivers
fs
include
init
ipc
kernel
lib
mm
net
scripts
security
sound
usr
.gitignore -rw-r--r-- 462 bytes
COPYING -rw-r--r-- 18.3 KB
CREDITS -rw-r--r-- 87.7 KB
Kbuild -rw-r--r-- 1.2 KB
MAINTAINERS -rw-r--r-- 67.2 KB
Makefile -rw-r--r-- 43.8 KB
README -rw-r--r-- 15.1 KB
REPORTING-BUGS -rw-r--r-- 3.0 KB

README

back to top