Revision 8aef18845266f5c05904c610088f2d1ed58f6be3 authored by Al Viro on 16 June 2011, 14:10:06 UTC, committed by Al Viro on 16 June 2011, 15:28:16 UTC
[Kudos to dhowells for tracking that crap down]

If two processes attempt to cause automounting on the same mountpoint at the same time, the vfsmount holding the mountpoint will be left with one too few references on it, causing a BUG when the kernel tries to clean up.

The problem is that lock_mount() drops the caller's reference to the mountpoint's vfsmount in the case where it finds something already mounted on the mountpoint as it transits to the mounted filesystem and replaces path->mnt with the new mountpoint vfsmount.

During a pathwalk, however, we don't take a reference on the vfsmount if it is the same as the one in the nameidata struct, but do_add_mount() doesn't know this.

The fix is to make sure we have a ref on the vfsmount of the mountpoint before calling do_add_mount(). However, if lock_mount() doesn't transit, we're then left with an extra ref on the mountpoint vfsmount which needs releasing. We can handle that in follow_managed() by not making assumptions about what we can and what we cannot get from lookup_mnt() as the current code does.

The callers of follow_managed() expect that reference to path->mnt will be grabbed iff path->mnt has been changed. follow_managed() and follow_automount() keep track of whether such reference has been grabbed and assume that it'll happen in those and only those cases that'll have us return with changed path->mnt. That assumption is almost correct - it breaks in case of racing automounts and in even harder to hit race between following a mountpoint and a couple of mount --move. The thing is, we don't need to make that assumption at all - after the end of loop in follow_managed() we can check if path->mnt has ended up unchanged and do mntput() if needed.
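The end-of-loop check described above can be illustrated with a small userspace model. This is only a sketch: `toy_mount`, `toy_path`, `toy_mntget()`, `toy_mntput()` and `toy_follow()` are hypothetical stand-ins invented for this illustration, not the kernel's structures and not the actual patch, and the model ignores dentries, locking and error paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the kernel's vfsmount/path. */
struct toy_mount { int refs; };
struct toy_path  { struct toy_mount *mnt; };

static void toy_mntget(struct toy_mount *m) { m->refs++; }
static void toy_mntput(struct toy_mount *m) { m->refs--; }

/*
 * Walk path->mnt through steps[0..nsteps-1], modelling transits onto
 * other mounts.  If automount_attempt is set, first pin the current
 * mount, in the spirit of taking a ref before calling do_add_mount().
 *
 * Contract being modelled: on return, the caller has gained a reference
 * on path->mnt if and only if path->mnt differs from its initial value.
 */
static void toy_follow(struct toy_path *path, struct toy_mount **steps,
		       size_t nsteps, bool automount_attempt)
{
	struct toy_mount *orig;
	bool need_put = false;	/* do we hold our own ref on path->mnt? */
	size_t i;

	assert(path && path->mnt);
	orig = path->mnt;	/* this reference belongs to the caller */

	if (automount_attempt) {
		toy_mntget(path->mnt);
		need_put = true;
	}

	for (i = 0; i < nsteps; i++) {
		toy_mntget(steps[i]);		/* ref on the mount we move to */
		if (need_put)
			toy_mntput(path->mnt);	/* drop the ref we took earlier */
		path->mnt = steps[i];
		need_put = true;
	}

	/* No assumptions: if we ended up where we started, undo our ref. */
	if (need_put && path->mnt == orig)
		toy_mntput(path->mnt);
}
```

In this model, a mount that starts with one caller-held reference keeps a count of 1 after an automount attempt that does not transit, and a walk that moves to another mount and then back to the original (roughly the shape of the `mount --move` race) leaves both counts as they were.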
The BUG can be reproduced with the following test program:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <sys/wait.h>
    int main(int argc, char **argv)
    {
        int pid, ws;
        struct stat buf;
        pid = fork();
        stat(argv[1], &buf);
        if (pid > 0)
            wait(&ws);
        return 0;
    }

and the following procedure:

 (1) Mount an NFS volume that on the server has something else mounted on a
     subdirectory. For instance, I can mount / from my server:

        mount warthog:/ /mnt -t nfs4 -r

     On the server /data has another filesystem mounted on it, so NFS will see
     a change in FSID as it walks down the path, and will mark /mnt/data as
     being a mountpoint. This will cause the automount code to be triggered.

     !!! Do not look inside the mounted fs at this point !!!

 (2) Run the above program on a file within the submount to generate two
     simultaneous automount requests:

        /tmp/forkstat /mnt/data/testfile

 (3) Unmount the automounted submount:

        umount /mnt/data

 (4) Unmount the original mount:

        umount /mnt

At this point the kernel should throw a BUG with something like the following:

BUG: Dentry ffff880032e3c5c0{i=2,n=} still in use (1) [unmount of nfs4 0:12]

Note that the bug appears on the root dentry of the original mount, not the mountpoint and not the submount because sys_umount() hasn't got to its final mntput_no_expire() yet, but this isn't so obvious from the call trace:

 [<ffffffff8117cd82>] shrink_dcache_for_umount+0x69/0x82
 [<ffffffff8116160e>] generic_shutdown_super+0x37/0x15b
 [<ffffffffa00fae56>] ? nfs_super_return_all_delegations+0x2e/0x1b1 [nfs]
 [<ffffffff811617f3>] kill_anon_super+0x1d/0x7e
 [<ffffffffa00d0be1>] nfs4_kill_super+0x60/0xb6 [nfs]
 [<ffffffff81161c17>] deactivate_locked_super+0x34/0x83
 [<ffffffff811629ff>] deactivate_super+0x6f/0x7b
 [<ffffffff81186261>] mntput_no_expire+0x18d/0x199
 [<ffffffff811862a8>] mntput+0x3b/0x44
 [<ffffffff81186d87>] release_mounts+0xa2/0xbf
 [<ffffffff811876af>] sys_umount+0x47a/0x4ba
 [<ffffffff8109e1ca>] ? trace_hardirqs_on_caller+0x1fd/0x22f
 [<ffffffff816ea86b>] system_call_fastpath+0x16/0x1b

as do_umount() is inlined. However, you can see release_mounts() in there.

Note also that it may be necessary to have multiple CPU cores to be able to trigger this bug.

Tested-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Ian Kent <raven@themaw.net>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
1 parent 50338b8
Kconfig
#
# Input device configuration
#

menu "Input device support"
	depends on !S390

config INPUT
	tristate "Generic input layer (needed for keyboard, mouse, ...)" if EXPERT
	default y
	help
	  Say Y here if you have any input device (mouse, keyboard, tablet,
	  joystick, steering wheel ...) connected to your system and want
	  it to be available to applications. This includes standard PS/2
	  keyboard and mouse.

	  Say N here if you have a headless (no monitor, no keyboard) system.

	  More information is available: <file:Documentation/input/input.txt>

	  If unsure, say Y.

	  To compile this driver as a module, choose M here: the
	  module will be called input.

if INPUT

config INPUT_FF_MEMLESS
	tristate "Support for memoryless force-feedback devices"
	help
	  Say Y here if you have a memoryless force-feedback input device
	  such as Logitech WingMan Force 3D, ThrustMaster FireStorm Dual
	  Power 2, or similar. You will also need to enable a
	  hardware-specific driver.

	  If unsure, say N.

	  To compile this driver as a module, choose M here: the
	  module will be called ff-memless.

config INPUT_POLLDEV
	tristate "Polled input device skeleton"
	help
	  Say Y here if you are using a driver for an input
	  device that periodically polls hardware state. This
	  option is only useful for out-of-tree drivers since
	  in-tree drivers select it automatically.

	  If unsure, say N.

	  To compile this driver as a module, choose M here: the
	  module will be called input-polldev.

config INPUT_SPARSEKMAP
	tristate "Sparse keymap support library"
	help
	  Say Y here if you are using a driver for an input
	  device that uses a sparse keymap. This option is only
	  useful for out-of-tree drivers since in-tree drivers
	  select it automatically.

	  If unsure, say N.

	  To compile this driver as a module, choose M here: the
	  module will be called sparse-keymap.

comment "Userland interfaces"

config INPUT_MOUSEDEV
	tristate "Mouse interface" if EXPERT
	default y
	help
	  Say Y here if you want your mouse to be accessible as char devices
	  13:32+ - /dev/input/mouseX and 13:63 - /dev/input/mice as an
	  emulated IntelliMouse Explorer PS/2 mouse. That way, all user space
	  programs (including SVGAlib, GPM and X) will be able to use your
	  mouse.

	  If unsure, say Y.

	  To compile this driver as a module, choose M here: the
	  module will be called mousedev.

config INPUT_MOUSEDEV_PSAUX
	bool "Provide legacy /dev/psaux device"
	default y
	depends on INPUT_MOUSEDEV
	help
	  Say Y here if you want your mouse to also be accessible as char
	  device 10:1 - /dev/psaux. The data available through /dev/psaux
	  is exactly the same as the data from /dev/input/mice.

	  If unsure, say Y.

config INPUT_MOUSEDEV_SCREEN_X
	int "Horizontal screen resolution"
	depends on INPUT_MOUSEDEV
	default "1024"
	help
	  If you're using a digitizer, or a graphic tablet, and want to use
	  it as a mouse then the mousedev driver needs to know the X window
	  screen resolution you are using to correctly scale the data. If
	  you're not using a digitizer, this value is ignored.

config INPUT_MOUSEDEV_SCREEN_Y
	int "Vertical screen resolution"
	depends on INPUT_MOUSEDEV
	default "768"
	help
	  If you're using a digitizer, or a graphic tablet, and want to use
	  it as a mouse then the mousedev driver needs to know the X window
	  screen resolution you are using to correctly scale the data. If
	  you're not using a digitizer, this value is ignored.

config INPUT_JOYDEV
	tristate "Joystick interface"
	help
	  Say Y here if you want your joystick or gamepad to be
	  accessible as char device 13:0+ - /dev/input/jsX device.

	  If unsure, say Y.

	  More information is available: <file:Documentation/input/joystick.txt>

	  To compile this driver as a module, choose M here: the
	  module will be called joydev.

config INPUT_EVDEV
	tristate "Event interface"
	help
	  Say Y here if you want your input device events to be accessible
	  under char device 13:64+ - /dev/input/eventX in a generic way.

	  To compile this driver as a module, choose M here: the
	  module will be called evdev.

config INPUT_EVBUG
	tristate "Event debugging"
	help
	  Say Y here if you have a problem with the input subsystem and
	  want all events (keypresses, mouse movements) to be output to
	  the system log. While this is useful for debugging, it's also
	  a security threat - your keypresses include your passwords, of
	  course.

	  If unsure, say N.

	  To compile this driver as a module, choose M here: the
	  module will be called evbug.

config INPUT_APMPOWER
	tristate "Input Power Event -> APM Bridge" if EXPERT
	depends on INPUT && APM_EMULATION
	help
	  Say Y here if you want suspend key events to trigger a user
	  requested suspend through APM. This is useful on embedded
	  systems where such behaviour is desired without userspace
	  interaction. If unsure, say N.

	  To compile this driver as a module, choose M here: the
	  module will be called apm-power.

comment "Input Device Drivers"

source "drivers/input/keyboard/Kconfig"

source "drivers/input/mouse/Kconfig"

source "drivers/input/joystick/Kconfig"

source "drivers/input/tablet/Kconfig"

source "drivers/input/touchscreen/Kconfig"

source "drivers/input/misc/Kconfig"

endif

menu "Hardware I/O ports"

source "drivers/input/serio/Kconfig"

source "drivers/input/gameport/Kconfig"

endmenu

endmenu
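
The character device numbers quoted in the help texts above (13:0+ for jsX, 13:32+ for mouseX, 13:63 for mice, 13:64+ for eventX, and 10:1 for psaux) can be turned into a small lookup. The following is a minimal sketch in C, assuming that each "N+" range runs up to the next base listed, which the help texts do not actually state. `enum input_node` and `classify_node()` are hypothetical names used only for this illustration.

```c
/* Hypothetical labels for the userland interfaces named in the help texts. */
enum input_node {
	NODE_UNKNOWN,
	NODE_JS,	/* /dev/input/jsX,    13:0+  (joydev)   */
	NODE_MOUSE,	/* /dev/input/mouseX, 13:32+ (mousedev) */
	NODE_MICE,	/* /dev/input/mice,   13:63  (mousedev) */
	NODE_EVENT,	/* /dev/input/eventX, 13:64+ (evdev)    */
	NODE_PSAUX	/* /dev/psaux,        10:1              */
};

/*
 * Map a (major, minor) pair to the interface it would belong to.
 * ASSUMPTION: a range written "N+" extends to just below the next
 * base number listed; the real per-driver limits may be narrower.
 */
static enum input_node classify_node(unsigned int maj, unsigned int min)
{
	if (maj == 10 && min == 1)
		return NODE_PSAUX;
	if (maj != 13)
		return NODE_UNKNOWN;
	if (min >= 64)
		return NODE_EVENT;
	if (min == 63)
		return NODE_MICE;
	if (min >= 32)
		return NODE_MOUSE;
	return NODE_JS;
}
```

To apply this to a real device node, one option would be to call stat() on the path and pass major(st.st_rdev) and minor(st.st_rdev) (available from <sys/sysmacros.h> on glibc) to classify_node().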