https://github.com/torvalds/linux
Revision 012a45e3f4af68e86d85cce060c6c2fed56498b2 authored by Leon Ma on 30 April 2014, 08:43:10 UTC, committed by Thomas Gleixner on 30 April 2014, 10:34:51 UTC
If a cpu is idle and starts an hrtimer which is not pinned on that same cpu, the nohz code might target the timer to a different cpu. In the case that we switch the cpu base of the timer we already have a sanity check in place, which determines whether the timer is earlier than the current leftmost timer on the target cpu. In that case we enqueue the timer on the current cpu because we cannot reprogram the clock event device on the target. If the timers base is already the target CPU we do not have this sanity check in place so we enqueue the timer as the leftmost timer in the target cpus rb tree, but we cannot reprogram the clock event device on the target cpu. So the timer expires late and subsequently prevents the reprogramming of the target cpu clock event device until the previously programmed event fires or a timer with an earlier expiry time gets enqueued on the target cpu itself. Add the same target check as we have for the switch base case and start the timer on the current cpu if it would become the leftmost timer on the target. [ tglx: Rewrote subject and changelog ] Signed-off-by: Leon Ma <xindong.ma@intel.com> Link: http://lkml.kernel.org/r/1398847391-5994-1-git-send-email-xindong.ma@intel.com Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
1 parent 6c6c0d5
Tip revision: 012a45e3f4af68e86d85cce060c6c2fed56498b2 authored by Leon Ma on 30 April 2014, 08:43:10 UTC
hrtimer: Prevent remote enqueue of leftmost timers
hrtimer: Prevent remote enqueue of leftmost timers
Tip revision: 012a45e
basic_profiling.txt
These instructions are deliberately very basic. If you want something clever,
go read the real docs ;-) Please don't add more stuff, but feel free to
correct my mistakes ;-) (mbligh@aracnet.com)
Thanks to John Levon, Dave Hansen, et al. for help writing this.
<test> is the thing you're trying to measure.
Make sure you have the correct System.map / vmlinux referenced!
It is probably easiest to use "make install" for linux and hack
/sbin/installkernel to copy vmlinux to /boot, in addition to vmlinuz,
config, System.map, which are usually installed by default.
Readprofile
-----------
A recent readprofile command is needed for 2.6, such as found in util-linux
2.12a, which can be downloaded from:
http://www.kernel.org/pub/linux/utils/util-linux/
Most distributions will ship it already.
Add "profile=2" to the kernel command line.
clear readprofile -r
<test>
dump output readprofile -m /boot/System.map > captured_profile
Oprofile
--------
Get the source (see Changes for required version) from
http://oprofile.sourceforge.net/ and add "idle=poll" to the kernel command
line.
Configure with CONFIG_PROFILING=y and CONFIG_OPROFILE=y & reboot on new kernel
./configure --with-kernel-support
make install
For superior results, be sure to enable the local APIC. If opreport sees
a 0Hz CPU, APIC was not on. Be aware that idle=poll may mean a performance
penalty.
One time setup:
opcontrol --setup --vmlinux=/boot/vmlinux
clear opcontrol --reset
start opcontrol --start
<test>
stop opcontrol --stop
dump output opreport > output_file
To only report on the kernel, run opreport -l /boot/vmlinux > output_file
A reset is needed to clear old statistics, which survive a reboot.
Computing file changes ...