Revision aeb309b81c6bada783c3695528a3e10748e97285 authored by Huang Ying on 12 July 2019, 03:55:44 UTC, committed by Linus Torvalds on 12 July 2019, 18:05:43 UTC
Since commit 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB trunks"),
the address_space associated with a swap device is freed after swapoff.
Users of swap_address_space() that touch the address_space therefore
need some mechanism to prevent the address_space from being freed while
it is being accessed.

When mincore processes an unmapped range for swapped shmem pages, it
holds no lock that prevents the swap device from being swapped off.  So
the following race is possible:

CPU1					CPU2
do_mincore()				swapoff()
  walk_page_range()
    mincore_unmapped_range()
      __mincore_unmapped_range
        mincore_page
	  as = swap_address_space()
          ...				  exit_swap_address_space()
          ...				    kvfree(spaces)
	  find_get_page(as)

The address space may be accessed after being freed.

To fix the race, find_get_page() is enclosed in
get_swap_device()/put_swap_device(), which checks whether the swap entry
is valid and prevents the swap device from being swapped off during the
access.

Link: http://lkml.kernel.org/r/20190611020510.28251-1-ying.huang@intel.com
Fixes: 4b3ef9daa4fc ("mm/swap: split swap cache into 64MB trunks")
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Andrea Parri <andrea.parri@amarulasolutions.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
highuid.txt
===================================================
Notes on the change from 16-bit UIDs to 32-bit UIDs
===================================================

:Author: Chris Wing <wingc@umich.edu>
:Last updated: January 11, 2000

- kernel code MUST take into account __kernel_uid_t and __kernel_uid32_t
  when communicating between user and kernel space in an ioctl or data
  structure.

- kernel code should use uid_t and gid_t in kernel-private structures and
  code.
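
The distinction between the two user-visible widths can be illustrated with
a small user-space sketch. It is not the kernel's implementation: the type
names ``old_uid16`` and ``uid32``, the helpers ``narrow_uid()`` and
``widen_uid()``, and the overflow value 65534 are assumptions chosen for
illustration. The idea is that a 32-bit UID that does not fit in a 16-bit
field is reported as a fixed overflow value instead of being silently
truncated, and that the 16-bit "no change" value (-1) keeps its meaning when
widened.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the 16-bit and 32-bit user-visible UID types. */
typedef uint16_t old_uid16;
typedef uint32_t uid32;

/* Assumed value reported when a UID does not fit in 16 bits. */
#define SKETCH_OVERFLOW_UID 65534u

/* Narrow a 32-bit UID for a 16-bit interface without wrapping around. */
static old_uid16 narrow_uid(uid32 uid)
{
    /* Any bit set above the low 16 means the value cannot be represented. */
    return (uid & ~0xFFFFu) ? (old_uid16)SKETCH_OVERFLOW_UID : (old_uid16)uid;
}

/* Widen a 16-bit UID, preserving the special value -1 ("leave unchanged"). */
static uid32 widen_uid(old_uid16 uid)
{
    return (uid == (old_uid16)-1) ? (uid32)-1 : (uid32)uid;
}
```

With these definitions, ``narrow_uid(1000)`` returns 1000, while
``narrow_uid(100000)`` returns the overflow value 65534 rather than the
truncated value 34464.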

What's left to be done for 32-bit UIDs on all Linux architectures:

- Disk quotas have an interesting limitation that is not related to the
  maximum UID/GID. They are limited by the maximum file size on the
  underlying filesystem, because quota records are written at offsets
  corresponding to the UID in question.
  Further investigation is needed to see if the quota system can cope
  properly with huge UIDs. If it can deal with 64-bit file offsets on all 
  architectures, this should not be a problem.

- Decide whether or not to keep backwards compatibility with the system
  accounting file, or if we should break it as the comments suggest
  (currently, the old 16-bit UID and GID are still written to disk, and
  part of the former pad space is used to store separate 32-bit UID and
  GID)

- Need to validate that OS emulation calls the 16-bit UID
  compatibility syscalls, if the OS being emulated used 16-bit UIDs, or
  uses the 32-bit UID system calls properly otherwise.

  This affects at least:

	- iBCS on Intel

	- sparc32 emulation on sparc64
	  (need to support whatever new 32-bit UID system calls are added to
	  sparc32)

- Validate that all filesystems behave properly.

  At present, 32-bit UIDs _should_ work for:

	- ext2
	- ufs
	- isofs
	- nfs
	- coda
	- udf

  Ioctl() fixups have been made for:

	- ncpfs
	- smbfs

  Filesystems with simple fixups to prevent 16-bit UID wraparound:

	- minix
	- sysv
	- qnx4

  Other filesystems have not been checked yet.

- The ncpfs and smbfs filesystems cannot presently use 32-bit UIDs in
  all ioctl()s. Some new ioctl()s have been added with 32-bit UIDs, but
  more are needed, as well as new user<->kernel data structures.

- The ELF core dump format only supports 16-bit UIDs on arm, i386, m68k,
  sh, and sparc32. Fixing this is probably not that important, but would
  require adding a new ELF section.

- The ioctl()s used to control the in-kernel NFS server only support
  16-bit UIDs on arm, i386, m68k, sh, and sparc32.

- Make sure that the UID mapping feature of AX25 networking works properly
  (it should be safe because it has always used a 32-bit integer to
  communicate between user and kernel).
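
The disk quota item above can be made concrete with a short sketch of the
offset arithmetic. It assumes quota records are stored back to back and
indexed directly by UID, as the item describes. The helper names and the
32-byte record size used in the usage note are hypothetical and are not
taken from the quota code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Byte offset of the quota record for `id`, assuming fixed-size records
 * indexed by UID. `record_size` is an assumed parameter, not the real
 * on-disk record size.
 */
static uint64_t quota_record_offset(uint32_t id, uint32_t record_size)
{
    /* Widen before multiplying so that large UIDs do not overflow 32 bits. */
    return (uint64_t)id * record_size;
}

/* True if the whole record for `id` lies within `max_file_size` bytes. */
static bool quota_record_fits(uint32_t id, uint32_t record_size,
                              uint64_t max_file_size)
{
    uint64_t off = quota_record_offset(id, record_size);

    return off + record_size <= max_file_size;
}
```

With an assumed 32-byte record, UID 65535 lands at offset 2097120 and fits
easily in a file limited to 2^31 - 1 bytes, whereas UID 100000000 needs an
offset of about 3.2 GB and does not fit. With 64-bit file offsets the largest
32-bit UID fits.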