Revision fb59e9f1e9786635ea12e12bf6adbb132e10f979 authored by Hugh Dickins on 04 March 2008, 22:29:16 UTC, committed by Linus Torvalds on 05 March 2008, 00:35:15 UTC
While testing force_empty, during an exit_mmap, __mem_cgroup_remove_list
called from mem_cgroup_uncharge_page oopsed on a NULL pointer in the lru list.
I couldn't see what racing tasks on other cpus were doing, but surmise that
another must have been in mem_cgroup_charge_common on the same page, between
its unlock_page_cgroup and spin_lock_irqsave near done (thanks to that kzalloc
which I'd almost changed to a kmalloc).

Normally such a race cannot happen: the ref_cnt prevents it, so the final
uncharge cannot race with the initial charge.  But force_empty buggers the
ref_cnt, that's what it's all about; and thereafter forced pages are
vulnerable to races such as this (just think of a shared page also mapped into
an mm of another mem_cgroup than that just emptied).  They remain vulnerable
until they're freed, indefinitely later.

This patch just fixes the oops by moving each unlock_page_cgroup call down
below adding to and removing from the list (only possible given the previous patch);
and while we're at it, we might as well make it an invariant that
page->page_cgroup is always set while pc is on lru.
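
To make the ordering concrete, here is a minimal single-threaded userspace
sketch, not the kernel code itself.  The struct layouts, the boolean standing
in for the page_cgroup lock, the on_lru flag standing in for list membership,
and the names charge, uncharge, invariant_holds and demo_charge_uncharge are
all assumptions made for illustration.  It only shows the unlock happening
after the list update, and checks the invariant that page->page_cgroup is set
whenever pc is on the lru; it does not reproduce the race.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's definitions. */
struct page_cgroup;

struct page {
    bool pc_locked;                  /* models the page_cgroup lock */
    struct page_cgroup *page_cgroup;
};

struct page_cgroup {
    struct page *page;
    bool on_lru;                     /* models membership of the lru list */
};

static void lock_page_cgroup(struct page *page)
{
    assert(!page->pc_locked);
    page->pc_locked = true;
}

static void unlock_page_cgroup(struct page *page)
{
    assert(page->pc_locked);
    page->pc_locked = false;
}

/* Invariant from the commit message: pc on lru implies page->page_cgroup set. */
static bool invariant_holds(const struct page *page,
                            const struct page_cgroup *pc)
{
    return !pc->on_lru || page->page_cgroup == pc;
}

static void charge(struct page *page, struct page_cgroup *pc)
{
    lock_page_cgroup(page);
    pc->page = page;
    page->page_cgroup = pc;          /* assign the pointer first ... */
    pc->on_lru = true;               /* ... then add to the list */
    assert(invariant_holds(page, pc));
    unlock_page_cgroup(page);        /* unlock only after the list update */
}

static void uncharge(struct page *page)
{
    struct page_cgroup *pc;

    lock_page_cgroup(page);
    pc = page->page_cgroup;
    if (pc) {
        pc->on_lru = false;          /* remove from the list first ... */
        page->page_cgroup = NULL;    /* ... then clear the pointer */
        assert(invariant_holds(page, pc));
    }
    unlock_page_cgroup(page);        /* again, unlock comes last */
}

/* Returns 0 if a charge followed by an uncharge leaves consistent state. */
int demo_charge_uncharge(void)
{
    struct page page = { false, NULL };
    struct page_cgroup pc = { NULL, false };

    charge(&page, &pc);
    if (!pc.on_lru || page.page_cgroup != &pc || page.pc_locked)
        return 1;
    uncharge(&page);
    if (pc.on_lru || page.page_cgroup != NULL || page.pc_locked)
        return 2;
    return 0;
}
```

In this sketch, an uncharge that takes the lock can never observe a pc that
has its pointer assigned but is not yet on the list, which is the window the
oops is surmised to have come from.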

But this behaviour of force_empty seems highly unsatisfactory to me: why have
a ref_cnt if we always have to cope with it being violated (as in the earlier
page migration patch)?  We may prefer force_empty to move pages to an orphan
mem_cgroup (could be the root, but better not), from which other cgroups could
recover them; we might need to reverse the locking again; but no time now for
such concerns.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1 parent 9b3c0a0
File Mode Size
Kconfig -rw-r--r-- 5.8 KB
Makefile -rw-r--r-- 1.2 KB
allocpercpu.c -rw-r--r-- 3.9 KB
backing-dev.c -rw-r--r-- 2.0 KB
bootmem.c -rw-r--r-- 12.3 KB
bounce.c -rw-r--r-- 6.4 KB
dmapool.c -rw-r--r-- 12.8 KB
fadvise.c -rw-r--r-- 2.9 KB
filemap.c -rw-r--r-- 68.2 KB
filemap_xip.c -rw-r--r-- 10.1 KB
fremap.c -rw-r--r-- 6.1 KB
highmem.c -rw-r--r-- 8.1 KB
hugetlb.c -rw-r--r-- 31.3 KB
internal.h -rw-r--r-- 1.6 KB
madvise.c -rw-r--r-- 9.5 KB
memcontrol.c -rw-r--r-- 27.2 KB
memory.c -rw-r--r-- 72.9 KB
memory_hotplug.c -rw-r--r-- 13.9 KB
mempolicy.c -rw-r--r-- 51.3 KB
mempool.c -rw-r--r-- 9.0 KB
migrate.c -rw-r--r-- 24.3 KB
mincore.c -rw-r--r-- 5.7 KB
mlock.c -rw-r--r-- 5.6 KB
mmap.c -rw-r--r-- 58.2 KB
mmzone.c -rw-r--r-- 750 bytes
mprotect.c -rw-r--r-- 7.4 KB
mremap.c -rw-r--r-- 10.8 KB
msync.c -rw-r--r-- 2.4 KB
nommu.c -rw-r--r-- 34.1 KB
oom_kill.c -rw-r--r-- 14.8 KB
page-writeback.c -rw-r--r-- 34.9 KB
page_alloc.c -rw-r--r-- 124.1 KB
page_io.c -rw-r--r-- 3.4 KB
page_isolation.c -rw-r--r-- 3.4 KB
pagewalk.c -rw-r--r-- 3.3 KB
pdflush.c -rw-r--r-- 6.4 KB
prio_tree.c -rw-r--r-- 6.3 KB
quicklist.c -rw-r--r-- 2.3 KB
readahead.c -rw-r--r-- 13.3 KB
rmap.c -rw-r--r-- 27.4 KB
shmem.c -rw-r--r-- 67.1 KB
shmem_acl.c -rw-r--r-- 4.6 KB
slab.c -rw-r--r-- 115.2 KB
slob.c -rw-r--r-- 15.8 KB
slub.c -rw-r--r-- 101.1 KB
sparse-vmemmap.c -rw-r--r-- 4.2 KB
sparse.c -rw-r--r-- 10.1 KB
swap.c -rw-r--r-- 13.3 KB
swap_state.c -rw-r--r-- 9.7 KB
swapfile.c -rw-r--r-- 45.1 KB
thrash.c -rw-r--r-- 2.0 KB
tiny-shmem.c -rw-r--r-- 2.9 KB
truncate.c -rw-r--r-- 12.9 KB
util.c -rw-r--r-- 2.7 KB
vmalloc.c -rw-r--r-- 20.0 KB
vmscan.c -rw-r--r-- 57.3 KB
vmstat.c -rw-r--r-- 19.7 KB