Revision d80e731ecab420ddcb79ee9d0ac427acbc187b4b authored by Oleg Nesterov on 24 February 2012, 19:07:11 UTC, committed by Linus Torvalds on 24 February 2012, 19:42:50 UTC
This patch is intentionally incomplete to simplify the review.
It ignores ep_unregister_pollwait() which plays with the same wqh.
See the next change.

epoll assumes that the EPOLL_CTL_ADD'ed file controls everything
f_op->poll() needs. In particular it assumes that the wait queue
can't go away until eventpoll_release(). This is not true in the
case of signalfd: the task which does EPOLL_CTL_ADD uses its
->sighand, which is not connected to the file.
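
For orientation, the userspace pattern in question looks roughly like
the sketch below. It only shows a task registering a signalfd with
epoll and reading one signal through it; it does not reproduce the
bug. The helper name signalfd_epoll_roundtrip() is hypothetical, and
the sketch assumes a Linux system with signalfd and epoll available.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): registers a signalfd
 * with epoll, raises SIGUSR1, and returns the signal number read back
 * through the signalfd, or -1 on any error.
 */
static int signalfd_epoll_roundtrip(void)
{
    sigset_t mask;
    struct epoll_event ev, out;
    struct signalfd_siginfo si;
    int sfd, epfd, ret = -1;

    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR1);
    /* Block SIGUSR1 so it stays pending and is reported via the fd. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0)
        return -1;

    sfd = signalfd(-1, &mask, SFD_CLOEXEC);
    if (sfd < 0)
        return -1;

    epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd < 0) {
        close(sfd);
        return -1;
    }

    /*
     * This is the EPOLL_CTL_ADD step discussed above: the calling
     * task is the one whose signal state backs the signalfd.
     */
    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN;
    ev.data.fd = sfd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sfd, &ev) < 0)
        goto out;

    if (raise(SIGUSR1) != 0)
        goto out;

    /* Wait up to one second for the signalfd to become readable. */
    if (epoll_wait(epfd, &out, 1, 1000) != 1)
        goto out;
    if (read(sfd, &si, sizeof(si)) != (ssize_t)sizeof(si))
        goto out;

    ret = (int)si.ssi_signo;
out:
    close(epfd);
    close(sfd);
    return ret;
}
```

The problematic case described in this message arises when the task
that performed the EPOLL_CTL_ADD goes away while the epoll instance
lives on elsewhere, which this single-process sketch deliberately
avoids.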

This patch adds the special event, POLLFREE, currently only for
epoll. It expects that the init_poll_funcptr()'ed hook will do the
necessary cleanup. Perhaps it should be defined as EPOLLFREE in
eventpoll.

__cleanup_sighand() is changed to do wake_up_poll(POLLFREE) if
->signalfd_wqh is not empty; the new signalfd_cleanup() helper is
added for this.

ep_poll_callback(POLLFREE) simply does list_del_init(task_list).
This makes this poll entry inconsistent, but we don't care. If you
share an epoll fd which contains our sigfd with another process, you
should blame yourself. signalfd is "really special". I simply do
not know how we can define the "right" semantics if it is used with
epoll.
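
The mechanism described above can be modelled in userspace as a rough
sketch. Everything prefixed with sk_ or SK_ below is hypothetical and
exists only for illustration: the flag values are not the kernel's
definitions, and the list and wait-queue types are simplified
stand-ins for the kernel's. The sketch shows a cleanup helper that
wakes waiters with POLLHUP along with POLLFREE only when the queue is
non-empty, and a callback that unlinks its own entry when it sees
POLLFREE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag values for this sketch only. */
#define SK_POLLHUP  0x0010u
#define SK_POLLFREE 0x4000u

/* Minimal circular doubly linked list, modelled on the kernel idiom. */
struct sk_list { struct sk_list *next, *prev; };

static void sk_list_init(struct sk_list *h) { h->next = h->prev = h; }
static bool sk_list_empty(const struct sk_list *h) { return h->next == h; }

static void sk_list_add(struct sk_list *n, struct sk_list *h)
{
    n->next = h->next;
    n->prev = h;
    h->next->prev = n;
    h->next = n;
}

static void sk_list_del_init(struct sk_list *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    sk_list_init(n);
}

/* One waiter on the queue; link is first so a link pointer can be cast. */
struct sk_waiter {
    struct sk_list link;
    void (*func)(struct sk_waiter *w, unsigned key);
    unsigned seen;              /* events delivered so far */
};

struct sk_wqh { struct sk_list head; };

/* Invoke every waiter's callback with the given event key. */
static void sk_wake_up_poll(struct sk_wqh *q, unsigned key)
{
    struct sk_list *pos = q->head.next, *next;

    while (pos != &q->head) {
        next = pos->next;       /* saved first: the callback may unlink pos */
        struct sk_waiter *w = (struct sk_waiter *)pos;
        w->func(w, key);
        pos = next;
    }
}

/* Stand-in for the epoll callback: on POLLFREE, drop off the queue. */
static void sk_poll_callback(struct sk_waiter *w, unsigned key)
{
    w->seen |= key;
    if (key & SK_POLLFREE)
        sk_list_del_init(&w->link);
}

/* Stand-in for the cleanup helper: notify waiters before the queue dies. */
static void sk_signalfd_cleanup(struct sk_wqh *q)
{
    if (!sk_list_empty(&q->head))
        sk_wake_up_poll(q, SK_POLLHUP | SK_POLLFREE);
}

/* Runs the scenario once; returns 1 if the waiter was notified and unlinked. */
static int sk_demo(void)
{
    struct sk_wqh wqh;
    struct sk_waiter w = { .func = sk_poll_callback, .seen = 0 };

    sk_list_init(&wqh.head);
    sk_list_init(&w.link);
    sk_list_add(&w.link, &wqh.head);

    sk_signalfd_cleanup(&wqh);

    return (w.seen & SK_POLLFREE) && (w.seen & SK_POLLHUP) &&
           sk_list_empty(&wqh.head) && sk_list_empty(&w.link);
}
```

After sk_signalfd_cleanup() returns, no waiter still points into the
queue, so the queue's memory could be released safely in this model.
The real code additionally has to deal with locking, which the sketch
leaves out entirely.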

The main problem is that epoll calls signalfd_poll() once to establish
the connection with the wait queue; after that, signalfd_poll(NULL)
returns different/inconsistent results depending on who does
EPOLL_CTL_MOD/signalfd_read/etc. IOW: apart from sigmask, signalfd
has nothing to do with the file, it works with the current thread.

In short: this patch is a hack which tries to fix the symptoms.
It also assumes that nobody can take tasklist_lock under epoll
locks; this seems to be true.

Note:

	- we do not have wake_up_all_poll() but wake_up_poll()
	  is fine, poll/epoll doesn't use WQ_FLAG_EXCLUSIVE.

	- signalfd_cleanup() uses POLLHUP along with POLLFREE,
	  we need a couple of simple changes in eventpoll.c to
	  make sure it can't be "lost".

Reported-by: Maxime Bizon <mbizon@freebox.fr>
Cc: <stable@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1 parent 855a85f
File Mode Size
Kconfig -rw-r--r-- 12.4 KB
Kconfig.debug -rw-r--r-- 1015 bytes
Makefile -rw-r--r-- 1.8 KB
backing-dev.c -rw-r--r-- 22.3 KB
bootmem.c -rw-r--r-- 20.8 KB
bounce.c -rw-r--r-- 6.5 KB
cleancache.c -rw-r--r-- 6.9 KB
compaction.c -rw-r--r-- 20.5 KB
debug-pagealloc.c -rw-r--r-- 2.1 KB
dmapool.c -rw-r--r-- 13.0 KB
fadvise.c -rw-r--r-- 3.6 KB
failslab.c -rw-r--r-- 1.3 KB
filemap.c -rw-r--r-- 69.5 KB
filemap_xip.c -rw-r--r-- 11.2 KB
fremap.c -rw-r--r-- 6.7 KB
highmem.c -rw-r--r-- 10.3 KB
huge_memory.c -rw-r--r-- 63.6 KB
hugetlb.c -rw-r--r-- 76.9 KB
hwpoison-inject.c -rw-r--r-- 3.3 KB
init-mm.c -rw-r--r-- 619 bytes
internal.h -rw-r--r-- 8.6 KB
kmemcheck.c -rw-r--r-- 2.8 KB
kmemleak-test.c -rw-r--r-- 3.3 KB
kmemleak.c -rw-r--r-- 52.6 KB
ksm.c -rw-r--r-- 55.1 KB
maccess.c -rw-r--r-- 1.6 KB
madvise.c -rw-r--r-- 11.5 KB
memblock.c -rw-r--r-- 26.6 KB
memcontrol.c -rw-r--r-- 142.4 KB
memory-failure.c -rw-r--r-- 41.7 KB
memory.c -rw-r--r-- 107.6 KB
memory_hotplug.c -rw-r--r-- 23.9 KB
mempolicy.c -rw-r--r-- 64.5 KB
mempool.c -rw-r--r-- 10.4 KB
migrate.c -rw-r--r-- 33.6 KB
mincore.c -rw-r--r-- 7.8 KB
mlock.c -rw-r--r-- 15.7 KB
mm_init.c -rw-r--r-- 3.7 KB
mmap.c -rw-r--r-- 69.5 KB
mmu_context.c -rw-r--r-- 1.4 KB
mmu_notifier.c -rw-r--r-- 9.1 KB
mmzone.c -rw-r--r-- 1.7 KB
mprotect.c -rw-r--r-- 7.9 KB
mremap.c -rw-r--r-- 14.0 KB
msync.c -rw-r--r-- 2.4 KB
nobootmem.c -rw-r--r-- 10.6 KB
nommu.c -rw-r--r-- 51.0 KB
oom_kill.c -rw-r--r-- 22.2 KB
page-writeback.c -rw-r--r-- 67.3 KB
page_alloc.c -rw-r--r-- 155.3 KB
page_cgroup.c -rw-r--r-- 11.8 KB
page_io.c -rw-r--r-- 3.2 KB
page_isolation.c -rw-r--r-- 3.6 KB
pagewalk.c -rw-r--r-- 5.7 KB
percpu-km.c -rw-r--r-- 2.8 KB
percpu-vm.c -rw-r--r-- 13.0 KB
percpu.c -rw-r--r-- 56.6 KB
pgtable-generic.c -rw-r--r-- 3.3 KB
prio_tree.c -rw-r--r-- 6.3 KB
process_vm_access.c -rw-r--r-- 13.3 KB
quicklist.c -rw-r--r-- 2.4 KB
readahead.c -rw-r--r-- 15.1 KB
rmap.c -rw-r--r-- 52.2 KB
shmem.c -rw-r--r-- 64.4 KB
slab.c -rw-r--r-- 119.3 KB
slob.c -rw-r--r-- 17.1 KB
slub.c -rw-r--r-- 128.3 KB
sparse-vmemmap.c -rw-r--r-- 5.9 KB
sparse.c -rw-r--r-- 20.5 KB
swap.c -rw-r--r-- 20.5 KB
swap_state.c -rw-r--r-- 10.7 KB
swapfile.c -rw-r--r-- 65.3 KB
thrash.c -rw-r--r-- 3.9 KB
truncate.c -rw-r--r-- 17.9 KB
util.c -rw-r--r-- 7.3 KB
vmalloc.c -rw-r--r-- 65.7 KB
vmscan.c -rw-r--r-- 101.2 KB
vmstat.c -rw-r--r-- 33.2 KB