https://github.com/torvalds/linux
Revision f4929bd37208540c2c6f416e9035ff1938f2dbc6 authored by Peter Zijlstra on 22 April 2011, 11:39:56 UTC, committed by Ingo Molnar on 22 April 2011, 11:50:27 UTC
Change the Nehalem cache events to use retired memory instruction counters
(similar to Westmere), this greatly improves the provided stats.

Using:

main ()
{
        int i;

        for (i = 0; i < 1000000000; i++) {
                asm("mov (%%rsp), %%rbx;"
                    "mov %%rbx, (%%rsp);" : : : "rbx");
        }
}

We find:

 $ perf stat --repeat 10 -e instructions:u -e l1-dcache-loads:u -e l1-dcache-stores:u ./loop_1b_loads+stores
  Performance counter stats for './loop_1b_loads+stores' (10 runs):
      4,000,081,056 instructions:u           #      0.000 IPC ( +-   0.000% )
      4,999,502,846 l1-dcache-loads:u          ( +-   0.008% )
      1,000,034,832 l1-dcache-stores:u         ( +-   0.000% )
         1.565184942  seconds time elapsed   ( +-   0.005% )

The 5b is surprising - we'd expect 1b:

 $ perf stat --repeat 10 -e instructions:u -e r10b:u -e l1-dcache-stores:u ./loop_1b_loads+stores
  Performance counter stats for './loop_1b_loads+stores' (10 runs):
      4,000,081,054 instructions:u           #      0.000 IPC ( +-   0.000% )
      1,000,021,961 r10b:u                     ( +-   0.000% )
      1,000,030,951 l1-dcache-stores:u         ( +-   0.000% )
         1.565055422  seconds time elapsed   ( +-   0.003% )

Which this patch thus fixes.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Link: http://lkml.kernel.org/n/tip-q9rtru7b7840tws75xzboapv@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
1 parent 1ea5a6a
History
Tip revision: f4929bd37208540c2c6f416e9035ff1938f2dbc6 authored by Peter Zijlstra on 22 April 2011, 11:39:56 UTC
perf, x86: Update/fix Intel Nehalem cache events
Tip revision: f4929bd
File Mode Size
Documentation
arch
block
crypto
drivers
firmware
fs
include
init
ipc
kernel
lib
mm
net
samples
scripts
security
sound
tools
usr
virt
.gitignore -rw-r--r-- 941 bytes
.mailmap -rw-r--r-- 4.1 KB
COPYING -rw-r--r-- 18.3 KB
CREDITS -rw-r--r-- 91.7 KB
Kbuild -rw-r--r-- 2.4 KB
Kconfig -rw-r--r-- 252 bytes
MAINTAINERS -rw-r--r-- 188.3 KB
Makefile -rw-r--r-- 51.1 KB
README -rw-r--r-- 17.1 KB
REPORTING-BUGS -rw-r--r-- 3.3 KB

README

back to top