Revision 63cae12bce9861cec309798d34701cf3da20bc71 authored by Peter Zijlstra on 09 December 2016, 13:59:00 UTC, committed by Ingo Molnar on 14 January 2017, 09:56:10 UTC
There is problem with installing an event in a task that is 'stuck' on an offline CPU. Blocked tasks are not dis-assosciated from offlined CPUs, after all, a blocked task doesn't run and doesn't require a CPU etc.. Only on wakeup do we ammend the situation and place the task on a available CPU. If we hit such a task with perf_install_in_context() we'll loop until either that task wakes up or the CPU comes back online, if the task waking depends on the event being installed, we're stuck. While looking into this issue, I also spotted another problem, if we hit a task with perf_install_in_context() that is in the middle of being migrated, that is we observe the old CPU before sending the IPI, but run the IPI (on the old CPU) while the task is already running on the new CPU, things also go sideways. Rework things to rely on task_curr() -- outside of rq->lock -- which is rather tricky. Imagine the following scenario where we're trying to install the first event into our task 't': CPU0 CPU1 CPU2 (current == t) t->perf_event_ctxp[] = ctx; smp_mb(); cpu = task_cpu(t); switch(t, n); migrate(t, 2); switch(p, t); ctx = t->perf_event_ctxp[]; // must not be NULL smp_function_call(cpu, ..); generic_exec_single() func(); spin_lock(ctx->lock); if (task_curr(t)) // false add_event_to_ctx(); spin_unlock(ctx->lock); perf_event_context_sched_in(); spin_lock(ctx->lock); // sees event So its CPU0's store of t->perf_event_ctxp[] that must not go 'missing'. Because if CPU2's load of that variable were to observe NULL, it would not try to schedule the ctx and we'd have a task running without its counter, which would be 'bad'. As long as we observe !NULL, we'll acquire ctx->lock. If we acquire it first and not see the event yet, then CPU0 must observe task_curr() and retry. If the install happens first, then we must see the event on sched-in and all is well. I think we can translate the first part (until the 'must not be NULL') of the scenario to a litmus test like: C C-peterz { } P0(int *x, int *y) { int r1; WRITE_ONCE(*x, 1); smp_mb(); r1 = READ_ONCE(*y); } P1(int *y, int *z) { WRITE_ONCE(*y, 1); smp_store_release(z, 1); } P2(int *x, int *z) { int r1; int r2; r1 = smp_load_acquire(z); smp_mb(); r2 = READ_ONCE(*x); } exists (0:r1=0 /\ 2:r1=1 /\ 2:r2=0) Where: x is perf_event_ctxp[], y is our tasks's CPU, and z is our task being placed on the rq of CPU2. The P0 smp_mb() is the one added by this patch, ordering the store to perf_event_ctxp[] from find_get_context() and the load of task_cpu() in task_function_call(). The smp_store_release/smp_load_acquire model the RCpc locking of the rq->lock and the smp_mb() of P2 is the context switch switching from whatever CPU2 was running to our task 't'. This litmus test evaluates into: Test C-peterz Allowed States 7 0:r1=0; 2:r1=0; 2:r2=0; 0:r1=0; 2:r1=0; 2:r2=1; 0:r1=0; 2:r1=1; 2:r2=1; 0:r1=1; 2:r1=0; 2:r2=0; 0:r1=1; 2:r1=0; 2:r2=1; 0:r1=1; 2:r1=1; 2:r2=0; 0:r1=1; 2:r1=1; 2:r2=1; No Witnesses Positive: 0 Negative: 7 Condition exists (0:r1=0 /\ 2:r1=1 /\ 2:r2=0) Observation C-peterz Never 0 7 Hash=e427f41d9146b2a5445101d3e2fcaa34 And the strong and weak model agree. Reported-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Will Deacon <will.deacon@arm.com> Cc: jeremy.linton@arm.com Link: http://lkml.kernel.org/r/20161209135900.GU3174@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
1 parent ad5013d
vg468.h
/*
* vg468.h 1.11 1999/10/25 20:03:34
*
* The contents of this file are subject to the Mozilla Public License
* Version 1.1 (the "License"); you may not use this file except in
* compliance with the License. You may obtain a copy of the License
* at http://www.mozilla.org/MPL/
*
* Software distributed under the License is distributed on an "AS IS"
* basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See
* the License for the specific language governing rights and
* limitations under the License.
*
* The initial developer of the original code is David A. Hinds
* <dahinds@users.sourceforge.net>. Portions created by David A. Hinds
* are Copyright (C) 1999 David A. Hinds. All Rights Reserved.
*
* Alternatively, the contents of this file may be used under the
* terms of the GNU General Public License version 2 (the "GPL"), in which
* case the provisions of the GPL are applicable instead of the
* above. If you wish to allow the use of your version of this file
* only under the terms of the GPL and not to allow others to use
* your version of this file under the MPL, indicate your decision by
* deleting the provisions above and replace them with the notice and
* other provisions required by the GPL. If you do not delete the
* provisions above, a recipient may use your version of this file
* under either the MPL or the GPL.
*/
#ifndef _LINUX_VG468_H
#define _LINUX_VG468_H
/* Special bit in I365_IDENT used for Vadem chip detection */
#define I365_IDENT_VADEM 0x08
/* Special definitions in I365_POWER */
#define VG468_VPP2_MASK 0x0c
#define VG468_VPP2_5V 0x04
#define VG468_VPP2_12V 0x08
/* Unique Vadem registers */
#define VG469_VSENSE 0x1f /* Card voltage sense */
#define VG469_VSELECT 0x2f /* Card voltage select */
#define VG468_CTL 0x38 /* Control register */
#define VG468_TIMER 0x39 /* Timer control */
#define VG468_MISC 0x3a /* Miscellaneous */
#define VG468_GPIO_CFG 0x3b /* GPIO configuration */
#define VG469_EXT_MODE 0x3c /* Extended mode register */
#define VG468_SELECT 0x3d /* Programmable chip select */
#define VG468_SELECT_CFG 0x3e /* Chip select configuration */
#define VG468_ATA 0x3f /* ATA control */
/* Flags for VG469_VSENSE */
#define VG469_VSENSE_A_VS1 0x01
#define VG469_VSENSE_A_VS2 0x02
#define VG469_VSENSE_B_VS1 0x04
#define VG469_VSENSE_B_VS2 0x08
/* Flags for VG469_VSELECT */
#define VG469_VSEL_VCC 0x03
#define VG469_VSEL_5V 0x00
#define VG469_VSEL_3V 0x03
#define VG469_VSEL_MAX 0x0c
#define VG469_VSEL_EXT_STAT 0x10
#define VG469_VSEL_EXT_BUS 0x20
#define VG469_VSEL_MIXED 0x40
#define VG469_VSEL_ISA 0x80
/* Flags for VG468_CTL */
#define VG468_CTL_SLOW 0x01 /* 600ns memory timing */
#define VG468_CTL_ASYNC 0x02 /* Asynchronous bus clocking */
#define VG468_CTL_TSSI 0x08 /* Tri-state some outputs */
#define VG468_CTL_DELAY 0x10 /* Card detect debounce */
#define VG468_CTL_INPACK 0x20 /* Obey INPACK signal? */
#define VG468_CTL_POLARITY 0x40 /* VCCEN polarity */
#define VG468_CTL_COMPAT 0x80 /* Compatibility stuff */
#define VG469_CTL_WS_COMPAT 0x04 /* Wait state compatibility */
#define VG469_CTL_STRETCH 0x10 /* LED stretch */
/* Flags for VG468_TIMER */
#define VG468_TIMER_ZEROPWR 0x10 /* Zero power control */
#define VG468_TIMER_SIGEN 0x20 /* Power up */
#define VG468_TIMER_STATUS 0x40 /* Activity timer status */
#define VG468_TIMER_RES 0x80 /* Timer resolution */
#define VG468_TIMER_MASK 0x0f /* Activity timer timeout */
/* Flags for VG468_MISC */
#define VG468_MISC_GPIO 0x04 /* General-purpose IO */
#define VG468_MISC_DMAWSB 0x08 /* DMA wait state control */
#define VG469_MISC_LEDENA 0x10 /* LED enable */
#define VG468_MISC_VADEMREV 0x40 /* Vadem revision control */
#define VG468_MISC_UNLOCK 0x80 /* Unique register lock */
/* Flags for VG469_EXT_MODE_A */
#define VG469_MODE_VPPST 0x03 /* Vpp steering control */
#define VG469_MODE_INT_SENSE 0x04 /* Internal voltage sense */
#define VG469_MODE_CABLE 0x08
#define VG469_MODE_COMPAT 0x10 /* i82365sl B or DF step */
#define VG469_MODE_TEST 0x20
#define VG469_MODE_RIO 0x40 /* Steer RIO to INTR? */
/* Flags for VG469_EXT_MODE_B */
#define VG469_MODE_B_3V 0x01 /* 3.3v for socket B */
#endif /* _LINUX_VG468_H */
![swh spinner](/static/img/swh-spinner.gif)
Computing file changes ...