Revision 7906d00cd1f687268f0a3599442d113767795ae6 authored by Andrea Arcangeli on 28 July 2008, 22:46:26 UTC, committed by Linus Torvalds on 28 July 2008, 23:30:21 UTC
mm_take_all_locks holds off reclaim from an entire mm_struct.  This allows
mmu notifiers to register into the mm at any time with the guarantee that
no mmu operation is in progress on the mm.

This operation locks against the VM for all pte/vma/mm related operations
that could ever happen on a certain mm.  This includes vmtruncate,
try_to_unmap, and all page faults.

The caller must take the mmap_sem in write mode before calling
mm_take_all_locks().  The caller isn't allowed to release the mmap_sem
until mm_drop_all_locks() returns.

mmap_sem in write mode is required in order to block all operations that
could modify pagetables and free pages without need of altering the vma
layout (for example populate_range() with nonlinear vmas).  It is also
needed in write mode to prevent new anon_vmas from being associated with
existing vmas.

A single task can't take more than one mm_take_all_locks() in a row or it
would deadlock.

mm_take_all_locks() and mm_drop_all_locks() are expensive operations that
may have to take thousands of locks.

mm_take_all_locks() can fail if it's interrupted by signals.
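
The calling rules above can be sketched as a small userspace model.  This
is only an illustration, not the kernel implementation: struct model_mm,
the model_* functions and the specific error codes (including -EINTR for
the signal case) are assumptions made for the example, and the boolean
fields merely stand in for the real locks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct mm_struct; not kernel code. */
struct model_mm {
	bool mmap_sem_write_held; /* models mmap_sem held in write mode */
	bool all_locks_held;      /* models the state after mm_take_all_locks() */
	bool signal_pending;      /* models a signal arriving during the lock walk */
};

/*
 * Models mm_take_all_locks(): returns 0 on success or a negative errno.
 * The error codes are assumptions chosen for this sketch.
 */
static int model_take_all_locks(struct model_mm *mm)
{
	if (!mm->mmap_sem_write_held)
		return -EPERM;   /* caller did not take mmap_sem in write mode */
	if (mm->all_locks_held)
		return -EDEADLK; /* a second take in a row would deadlock */
	if (mm->signal_pending)
		return -EINTR;   /* interrupted by a signal: nothing stays locked */
	mm->all_locks_held = true;
	return 0;
}

/* Models mm_drop_all_locks(): only valid after a successful take. */
static void model_drop_all_locks(struct model_mm *mm)
{
	assert(mm->mmap_sem_write_held && mm->all_locks_held);
	mm->all_locks_held = false;
}

/*
 * Caller-side pattern: take mmap_sem for writing, take all locks, do the
 * work (here: set *registered), drop all locks, and only then release
 * mmap_sem.  On failure mmap_sem is still released and no work is done.
 */
static int model_register(struct model_mm *mm, int *registered)
{
	int ret;

	mm->mmap_sem_write_held = true;  /* stands in for down_write() */
	ret = model_take_all_locks(mm);
	if (ret)
		goto out;
	*registered = 1;                 /* no mmu operation can be in flight here */
	model_drop_all_locks(mm);
out:
	mm->mmap_sem_write_held = false; /* stands in for up_write() */
	return ret;
}
```

Calling model_register() on a zero-initialised struct model_mm returns 0 and
sets *registered; with signal_pending set it returns the error and leaves
*registered untouched, which is the behaviour a caller has to handle.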

When mmu_notifier_register returns, we must be sure that the driver is
notified if some task is in the middle of a vmtruncate for the 'mm' where
the mmu notifier was registered (mmu_notifier_invalidate_range_start/end
is run around the vmtruncation but mmu_notifier_register can run after
mmu_notifier_invalidate_range_start and before
mmu_notifier_invalidate_range_end).  The rmap paths have the same problem.
We also have to remove page pinning to avoid replicating the tlb_gather logic
inside KVM (and GRU doesn't work well with page pinning regardless of
needing tlb_gather), so without mm_take_all_locks when vmtruncate frees
the page, kvm would have no way to notice that it mapped into sptes a page
that is going into the freelist without a chance of any further
mmu_notifier notification.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Andrea Arcangeli <andrea@qumranet.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Robin Holt <holt@sgi.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Kanoj Sarcar <kanojsarcar@yahoo.com>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Chris Wright <chrisw@redhat.com>
Cc: Marcelo Tosatti <marcelo@kvack.org>
Cc: Eric Dumazet <dada1@cosmosbay.com>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Izik Eidus <izike@qumranet.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1 parent 6beeac7
Kconfig.binfmt
config BINFMT_ELF
	bool "Kernel support for ELF binaries"
	depends on MMU && (BROKEN || !FRV)
	default y
	---help---
	  ELF (Executable and Linkable Format) is a format for libraries and
	  executables used across different architectures and operating
	  systems. Saying Y here will enable your kernel to run ELF binaries
	  and enlarge it by about 13 KB. ELF support under Linux has now all
	  but replaced the traditional Linux a.out formats (QMAGIC and ZMAGIC)
	  because it is portable (this does *not* mean that you will be able
	  to run executables from different architectures or operating systems
	  however) and makes building run-time libraries very easy. Many new
	  executables are distributed solely in ELF format. You definitely
	  want to say Y here.

	  Information about ELF is contained in the ELF HOWTO available from
	  <http://www.tldp.org/docs.html#howto>.

	  If you find that after upgrading from Linux kernel 1.2 and saying Y
	  here, you still can't run any ELF binaries (they just crash), then
	  you'll have to install the newest ELF runtime libraries, including
	  ld.so (check the file <file:Documentation/Changes> for location and
	  latest version).

config COMPAT_BINFMT_ELF
	bool
	depends on COMPAT && MMU

config BINFMT_ELF_FDPIC
	bool "Kernel support for FDPIC ELF binaries"
	default y
	depends on (FRV || BLACKFIN || (SUPERH32 && !MMU))
	help
	  ELF FDPIC binaries are based on ELF, but allow the individual load
	  segments of a binary to be located in memory independently of each
	  other. This makes this format ideal for use in environments where no
	  MMU is available as it still permits text segments to be shared,
	  even if data segments are not.

	  It is also possible to run FDPIC ELF binaries on Linux with an MMU.

config BINFMT_FLAT
	bool "Kernel support for flat binaries"
	depends on !MMU && (!FRV || BROKEN)
	help
	  Support uClinux FLAT format binaries.

config BINFMT_ZFLAT
	bool "Enable ZFLAT support"
	depends on BINFMT_FLAT
	select ZLIB_INFLATE
	help
	  Support compressed FLAT format binaries.

config BINFMT_SHARED_FLAT
	bool "Enable shared FLAT support"
	depends on BINFMT_FLAT
	help
	  Support FLAT shared libraries.

config BINFMT_AOUT
	tristate "Kernel support for a.out and ECOFF binaries"
	depends on ARCH_SUPPORTS_AOUT && \
		(X86_32 || ALPHA || ARM || M68K)
	---help---
	  A.out (Assembler.OUTput) is a set of formats for libraries and
	  executables used in the earliest versions of UNIX.  Linux used
	  the a.out formats QMAGIC and ZMAGIC until they were replaced
	  with the ELF format.

	  The conversion to ELF started in 1995.  This option is primarily
	  provided for historical interest and for the benefit of those
	  who need to run binaries from that era.

	  Most people should answer N here.  If you think you may have
	  occasional use for this format, enable module support above
	  and answer M here to compile this support as a module called
	  binfmt_aout.

	  If any crucial components of your system (such as /sbin/init
	  or /lib/ld.so) are still in a.out format, you will have to
	  say Y here.

config OSF4_COMPAT
	bool "OSF/1 v4 readv/writev compatibility"
	depends on ALPHA && BINFMT_AOUT
	help
	  Say Y if you are using OSF/1 binaries (like Netscape and Acrobat)
	  with v4 shared libraries freely available from Compaq. If you're
	  going to use shared libraries from Tru64 version 5.0 or later, say N.

config BINFMT_EM86
	tristate "Kernel support for Linux/Intel ELF binaries"
	depends on ALPHA
	---help---
	  Say Y here if you want to be able to execute Linux/Intel ELF
	  binaries just like native Alpha binaries on your Alpha machine. For
	  this to work, you need to have the emulator /usr/bin/em86 in place.

	  You can get the same functionality by saying N here and saying Y to
	  "Kernel support for MISC binaries".

	  You may answer M to compile the emulation support as a module and
	  later load the module when you want to use a Linux/Intel binary. The
	  module will be called binfmt_em86. If unsure, say Y.

config BINFMT_SOM
	tristate "Kernel support for SOM binaries"
	depends on PARISC && HPUX
	help
	  SOM is a binary executable format inherited from HP/UX.  Say
	  Y here to be able to load and execute SOM binaries directly.

config BINFMT_MISC
	tristate "Kernel support for MISC binaries"
	---help---
	  If you say Y here, it will be possible to plug wrapper-driven binary
	  formats into the kernel. You will like this especially when you use
	  programs that need an interpreter to run like Java, Python, .NET or
	  Emacs-Lisp. It's also useful if you often run DOS executables under
	  the Linux DOS emulator DOSEMU (read the DOSEMU-HOWTO, available from
	  <http://www.tldp.org/docs.html#howto>). Once you have
	  registered such a binary class with the kernel, you can start one of
	  those programs simply by typing in its name at a shell prompt; Linux
	  will automatically feed it to the correct interpreter.

	  You can do other nice things, too. Read the file
	  <file:Documentation/binfmt_misc.txt> to learn how to use this
	  feature, <file:Documentation/java.txt> for information about how
	  to include Java support, and <file:Documentation/mono.txt> for
	  information about how to include Mono-based .NET support.

	  To use binfmt_misc, you will need to mount it:
		mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc

	  You may say M here for module support and later load the module when
	  you have use for it; the module is called binfmt_misc. If you
	  don't know what to answer at this point, say Y.