Revision 3ad33b2436b545cbe8b28e53f3710432cad457ab authored by Lee Schermerhorn on 15 November 2007, 00:59:10 UTC, committed by Linus Torvalds on 15 November 2007, 02:45:38 UTC
We hit the BUG_ON() in mm/rmap.c:vma_address() when trying to migrate via mbind(MPOL_MF_MOVE) a non-anon region that spans multiple vmas. For anon-regions, we just fail to migrate any pages beyond the 1st vma in the range.

This occurs because do_mbind() collects a list of pages to migrate by calling check_range(). check_range() walks the task's mm, spanning vmas as necessary, to collect the migratable pages into a list. Then, do_mbind() calls migrate_pages() passing the list of pages, a function to allocate new pages based on vma policy [new_vma_page()], and a pointer to the first vma of the range.

For each page in the list, new_vma_page() calls page_address_in_vma() passing the page and the vma [first in range] to obtain the address to pass to alloc_page_vma(). The page address is needed to get interleaving policy correct. If the pages in the list come from multiple vmas, eventually, new_vma_page() will pass a page to page_address_in_vma() with the incorrect vma. For !PageAnon pages, this will result in a bug check in rmap.c:vma_address(). For anon pages, vma_address() will just return -EFAULT and fail the migration.

This patch modifies new_vma_page() to check the return value from page_address_in_vma(). If the return value is -EFAULT, new_vma_page() searches forward via vm_next for the vma that maps the page, i.e., one for which page_address_in_vma() does not return -EFAULT. This assumes that the pages in the list handed to migrate_pages() are in address order. This is currently the case. The patch documents this assumption in a new comment block for new_vma_page().

If new_vma_page() cannot locate the vma mapping the page in a forward search in the mm, it will pass a NULL vma to alloc_page_vma(). This will result in the allocation using the task policy, if any, else system default policy. This situation is unlikely, but the patch documents this behavior with a comment.

Note, this patch results in restarting from the first vma in a multi-vma range each time new_vma_page() is called. If this is not acceptable, we can make the vma argument a pointer, both in new_vma_page() and its caller unmap_and_move(), so that the value held by the loop in migrate_pages() always passes down the last vma in which a page was found. This will require changes to all new_page_t functions passed to migrate_pages(). Is this necessary?

For this patch to work, we can't bug check in vma_address() for pages outside the argument vma. This patch removes the BUG_ON(). All other callers [besides new_vma_page()] already check the return status.

Tested on x86_64, 4 node NUMA platform.

Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
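The forward search described in the message can be illustrated outside the kernel. The following is a minimal user-space sketch, not the patch itself: `struct sketch_vma`, `struct sketch_page`, `SKETCH_EFAULT`, `sketch_page_address_in_vma()` and `sketch_find_vma_for_page()` are hypothetical stand-ins invented for this illustration, and the real page_address_in_vma() consults the page's mapping rather than a stored address.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's EFAULT error code. */
#define SKETCH_EFAULT 14

/* Simplified stand-in for a vma: an address range plus a link to the next vma. */
struct sketch_vma {
    unsigned long vm_start;      /* first address covered */
    unsigned long vm_end;        /* first address past the range */
    struct sketch_vma *vm_next;  /* next vma in address order, or NULL */
};

/* Simplified stand-in for a page: only the user address it is mapped at. */
struct sketch_page {
    unsigned long mapped_at;
};

/*
 * Mimics the contract the message relies on: return the page's address
 * if the vma covers it, otherwise -SKETCH_EFAULT instead of bug checking.
 */
static long sketch_page_address_in_vma(const struct sketch_page *page,
                                       const struct sketch_vma *vma)
{
    if (page->mapped_at < vma->vm_start || page->mapped_at >= vma->vm_end)
        return -SKETCH_EFAULT;
    return (long)page->mapped_at;
}

/*
 * Forward search: start at the vma handed in (the first vma of the range)
 * and follow vm_next until some vma maps the page. Returns NULL when no
 * vma in the remainder of the list maps it; in the patch that NULL is
 * passed on to the allocator, which then falls back to the task or
 * system default policy.
 */
static struct sketch_vma *sketch_find_vma_for_page(const struct sketch_page *page,
                                                   struct sketch_vma *vma)
{
    while (vma) {
        if (sketch_page_address_in_vma(page, vma) != -SKETCH_EFAULT)
            break;
        vma = vma->vm_next;
    }
    return vma;
}
```

Because every call starts again from the first vma, a range of n vmas costs up to n steps per page, which is the restart cost the message raises as an open question.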
1 parent e1a1c99
File | Mode | Size |
---|---|---|
8xx_immap.h | -rw-r--r-- | 13.8 KB |
amigayle.h | -rw-r--r-- | 31 bytes |
amipcmcia.h | -rw-r--r-- | 32 bytes |
bootinfo.h | -rw-r--r-- | 1.1 KB |
bootx.h | -rw-r--r-- | 4.5 KB |
btext.h | -rw-r--r-- | 905 bytes |
commproc.h | -rw-r--r-- | 25.4 KB |
cpm2.h | -rw-r--r-- | 52.7 KB |
delay.h | -rw-r--r-- | 1.9 KB |
device.h | -rw-r--r-- | 129 bytes |
floppy.h | -rw-r--r-- | 4.0 KB |
fs_pd.h | -rw-r--r-- | 752 bytes |
gg2.h | -rw-r--r-- | 2.4 KB |
gt64260.h | -rw-r--r-- | 9.7 KB |
gt64260_defs.h | -rw-r--r-- | 37.1 KB |
harrier.h | -rw-r--r-- | 1.2 KB |
hawk.h | -rw-r--r-- | 1011 bytes |
hawk_defs.h | -rw-r--r-- | 2.2 KB |
highmem.h | -rw-r--r-- | 3.3 KB |
hydra.h | -rw-r--r-- | 2.9 KB |
ibm403.h | -rw-r--r-- | 17.1 KB |
ibm405.h | -rw-r--r-- | 11.8 KB |
ibm44x.h | -rw-r--r-- | 23.4 KB |
ibm4xx.h | -rw-r--r-- | 2.3 KB |
ibm_ocp.h | -rw-r--r-- | 6.9 KB |
ibm_ocp_pci.h | -rw-r--r-- | 627 bytes |
immap_85xx.h | -rw-r--r-- | 5.5 KB |
immap_cpm2.h | -rw-r--r-- | 10.5 KB |
io.h | -rw-r--r-- | 13.8 KB |
irq_regs.h | -rw-r--r-- | 34 bytes |
kdebug.h | -rw-r--r-- | 32 bytes |
kgdb.h | -rw-r--r-- | 1.7 KB |
m8260_pci.h | -rw-r--r-- | 5.9 KB |
machdep.h | -rw-r--r-- | 4.9 KB |
md.h | -rw-r--r-- | 246 bytes |
mk48t59.h | -rw-r--r-- | 658 bytes |
mmu.h | -rw-r--r-- | 15.1 KB |
mmu_context.h | -rw-r--r-- | 5.7 KB |
mpc10x.h | -rw-r--r-- | 6.8 KB |
mpc52xx.h | -rw-r--r-- | 13.9 KB |
mpc52xx_psc.h | -rw-r--r-- | 5.5 KB |
mpc8260.h | -rw-r--r-- | 1.9 KB |
mpc8260_pci9.h | -rw-r--r-- | 1.4 KB |
mpc83xx.h | -rw-r--r-- | 3.7 KB |
mpc85xx.h | -rw-r--r-- | 6.6 KB |
mpc8xx.h | -rw-r--r-- | 2.5 KB |
mv64x60.h | -rw-r--r-- | 11.5 KB |
mv64x60_defs.h | -rw-r--r-- | 33.9 KB |
ocp.h | -rw-r--r-- | 6.7 KB |
ocp_ids.h | -rw-r--r-- | 1.8 KB |
open_pic.h | -rw-r--r-- | 2.8 KB |
page.h | -rw-r--r-- | 3.7 KB |
pc_serial.h | -rw-r--r-- | 1.5 KB |
pci-bridge.h | -rw-r--r-- | 4.5 KB |
pci.h | -rw-r--r-- | 4.3 KB |
pgalloc.h | -rw-r--r-- | 1.3 KB |
pgtable.h | -rw-r--r-- | 29.2 KB |
pnp.h | -rw-r--r-- | 28.0 KB |
ppc4xx_dma.h | -rw-r--r-- | 18.9 KB |
ppc4xx_pic.h | -rw-r--r-- | 1.7 KB |
ppc_sys.h | -rw-r--r-- | 3.3 KB |
ppcboot.h | -rw-r--r-- | 3.7 KB |
prep_nvram.h | -rw-r--r-- | 4.6 KB |
prom.h | -rw-r--r-- | 1.2 KB |
raven.h | -rw-r--r-- | 973 bytes |
reg_booke.h | -rw-r--r-- | 22.8 KB |
residual.h | -rw-r--r-- | 14.8 KB |
rio.h | -rw-r--r-- | 506 bytes |
rtc.h | -rw-r--r-- | 2.2 KB |
serial.h | -rw-r--r-- | 1.1 KB |
smp.h | -rw-r--r-- | 1.8 KB |
spinlock.h | -rw-r--r-- | 3.2 KB |
suspend.h | -rw-r--r-- | 165 bytes |
system.h | -rw-r--r-- | 6.8 KB |
time.h | -rw-r--r-- | 3.8 KB |
todc.h | -rw-r--r-- | 19.4 KB |
traps.h | -rw-r--r-- | 28 bytes |
zorro.h | -rw-r--r-- | 860 bytes |