https://github.com/torvalds/linux
Revision ecf7d01c229d11a44609c0067889372c91fb4f36 authored by Peter Zijlstra on 07 October 2015, 12:14:13 UTC, committed by Ingo Molnar on 04 December 2015, 09:26:43 UTC
Oleg noticed that its possible to falsely observe p->on_cpu == 0 such
that we'll prematurely continue with the wakeup and effectively run p on
two CPUs at the same time.

Even though the overlap is very limited; the task is in the middle of
being scheduled out; it could still result in corruption of the
scheduler data structures.

        CPU0                            CPU1

        set_current_state(...)

        <preempt_schedule>
          context_switch(X, Y)
            prepare_lock_switch(Y)
              Y->on_cpu = 1;
            finish_lock_switch(X)
              store_release(X->on_cpu, 0);

                                        try_to_wake_up(X)
                                          LOCK(p->pi_lock);

                                          t = X->on_cpu; // 0

          context_switch(Y, X)
            prepare_lock_switch(X)
              X->on_cpu = 1;
            finish_lock_switch(Y)
              store_release(Y->on_cpu, 0);
        </preempt_schedule>

        schedule();
          deactivate_task(X);
          X->on_rq = 0;

                                          if (X->on_rq) // false

                                          if (t) while (X->on_cpu)
                                            cpu_relax();

          context_switch(X, ..)
            finish_lock_switch(X)
              store_release(X->on_cpu, 0);

Avoid the load of X->on_cpu being hoisted over the X->on_rq load.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
1 parent b75a225
History
Tip revision: ecf7d01c229d11a44609c0067889372c91fb4f36 authored by Peter Zijlstra on 07 October 2015, 12:14:13 UTC
sched/core: Fix an SMP ordering race in try_to_wake_up() vs. schedule()
Tip revision: ecf7d01
File Mode Size
Documentation
arch
block
certs
crypto
drivers
firmware
fs
include
init
ipc
kernel
lib
mm
net
samples
scripts
security
sound
tools
usr
virt
.get_maintainer.ignore -rw-r--r-- 31 bytes
.gitignore -rw-r--r-- 1.2 KB
.mailmap -rw-r--r-- 5.4 KB
COPYING -rw-r--r-- 18.3 KB
CREDITS -rw-r--r-- 94.9 KB
Kbuild -rw-r--r-- 2.6 KB
Kconfig -rw-r--r-- 252 bytes
MAINTAINERS -rw-r--r-- 328.3 KB
Makefile -rw-r--r-- 53.5 KB
README -rw-r--r-- 18.2 KB
REPORTING-BUGS -rw-r--r-- 7.3 KB

README

back to top