Revision 4837fe37adff1d159904f0c013471b1ecbcb455e authored by Michal Hocko on 14 December 2017, 23:33:15 UTC, committed by Linus Torvalds on 15 December 2017, 00:00:49 UTC
David Rientjes has reported the following memory corruption while the
oom reaper tries to unmap the victims address space

  BUG: Bad page map in process oom_reaper  pte:6353826300000000 pmd:00000000
  addr:00007f50cab1d000 vm_flags:08100073 anon_vma:ffff9eea335603f0 mapping:          (null) index:7f50cab1d
  file:          (null) fault:          (null) mmap:          (null) readpage:          (null)
  CPU: 2 PID: 1001 Comm: oom_reaper
  Call Trace:
     unmap_page_range+0x1068/0x1130
     __oom_reap_task_mm+0xd5/0x16b
     oom_reaper+0xff/0x14c
     kthread+0xc1/0xe0

Tetsuo Handa has noticed that the synchronization inside exit_mmap is
insufficient.  We only synchronize with the oom reaper if
tsk_is_oom_victim which is not true if the final __mmput is called from
a different context than the oom victim exit path.  This can trivially
happen from context of any task which has grabbed mm reference (e.g.  to
read /proc/<pid>/ file which requires mm etc.).

The race would look like this

  oom_reaper		oom_victim		task
						mmget_not_zero
			do_exit
			  mmput
  __oom_reap_task_mm				mmput
  						  __mmput
						    exit_mmap
						      remove_vma
    unmap_page_range

Fix this issue by providing a new mm_is_oom_victim() helper which
operates on the mm struct rather than a task.  Any context which
operates on a remote mm struct should use this helper in place of
tsk_is_oom_victim.  The flag is set in mark_oom_victim and never cleared
so it is stable in the exit_mmap path.

Debugged by Tetsuo Handa.

Link: http://lkml.kernel.org/r/20171210095130.17110-1-mhocko@kernel.org
Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: David Rientjes <rientjes@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Andrea Argangeli <andrea@kernel.org>
Cc: <stable@vger.kernel.org>	[4.14]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1 parent bdcf0a4
Raw File
install.sh
#!/bin/sh
#
# This file is subject to the terms and conditions of the GNU General Public
# License.  See the file "COPYING" in the main directory of this archive
# for more details.
#
# Copyright (C) 1995 by Linus Torvalds
#
# Adapted from code in arch/i386/boot/Makefile by H. Peter Anvin
#
# "make install" script for m68k architecture
#
# Arguments:
#   $1 - kernel version
#   $2 - kernel image file
#   $3 - kernel map file
#   $4 - default install path (blank if root directory)
#

verify () {
	if [ ! -f "$1" ]; then
		echo ""                                                   1>&2
		echo " *** Missing file: $1"                              1>&2
		echo ' *** You need to run "make" before "make install".' 1>&2
		echo ""                                                   1>&2
		exit 1
	fi
}

# Make sure the files actually exist
verify "$2"
verify "$3"

# User may have a custom install script

if [ -x ~/bin/${INSTALLKERNEL} ]; then exec ~/bin/${INSTALLKERNEL} "$@"; fi
if [ -x /sbin/${INSTALLKERNEL} ]; then exec /sbin/${INSTALLKERNEL} "$@"; fi

# Default install - same as make zlilo

if [ -f $4/vmlinuz ]; then
	mv $4/vmlinuz $4/vmlinuz.old
fi

if [ -f $4/System.map ]; then
	mv $4/System.map $4/System.old
fi

cat $2 > $4/vmlinuz
cp $3 $4/System.map

sync
back to top