Revision efad4e475c312456edb3c789d0996d12ed744c13 authored by Michal Hocko on 01 February 2019, 22:20:34 UTC, committed by Linus Torvalds on 01 February 2019, 23:46:23 UTC
Patch series "mm, memory_hotplug: fix uninitialized pages fallouts", v2.

Mikhail Zaslonko posted fixes for the two bugs quite some time ago
[1].  I pushed back on those fixes because I believed that it is
much better to plug the problem at initialization time rather than
play whack-a-mole all over the hotplug code to find all the places
that expect the full memory section to be initialized.

We ended up with commit 2830bf6f05fb ("mm, memory_hotplug:
initialize struct pages for the full memory section") merged, and it
caused a regression [2][3].  The reason is that there might be memory
layouts where two NUMA nodes share the same memory section, so the
merged fix is simply incorrect.

In order to plug this hole we really have to be zone-range aware in
those handlers.  I have split up the original patch into two.  One is
unchanged (patch 2) and I took a different approach for the `removable'
crash.

[1] http://lkml.kernel.org/r/20181105150401.97287-2-zaslonko@linux.ibm.com
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1666948
[3] http://lkml.kernel.org/r/20190125163938.GA20411@dhcp22.suse.cz

This patch (of 2):

Mikhail has reported the following VM_BUG_ON triggered when reading the
sysfs removable state of a memory block:

 page:000003d08300c000 is uninitialized and poisoned
 page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
 Call Trace:
   is_mem_section_removable+0xb4/0x190
   show_mem_removable+0x9a/0xd8
   dev_attr_show+0x34/0x70
   sysfs_kf_seq_show+0xc8/0x148
   seq_read+0x204/0x480
   __vfs_read+0x32/0x178
   vfs_read+0x82/0x138
   ksys_read+0x5a/0xb0
   system_call+0xdc/0x2d8
 Last Breaking-Event-Address:
   is_mem_section_removable+0xb4/0x190
 Kernel panic - not syncing: Fatal exception: panic_on_oops

The reason is that the memory block spans the zone boundary and we are
stumbling over an uninitialized struct page.  Fix this by enforcing the
zone range in is_mem_section_removable so that we never run past the end
of a zone.
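
The idea can be illustrated with a small userspace sketch.  This is not
the kernel code: struct toy_zone, toy_min() and toy_scan_block() are
hypothetical names made up for this example, and a plain bool array
stands in for the "struct page is initialized" state.  It only shows why
clamping the end of the walk to the end of the zone avoids touching an
uninitialized entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a zone: the half-open PFN range [start_pfn, end_pfn). */
struct toy_zone {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

unsigned long toy_min(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Walk the pages of a memory block that starts inside "zone".
 * initialized[pfn] models whether the struct page for pfn was set up.
 * Returns the number of pages inspected, or -1 if the walk touched an
 * uninitialized entry (the situation that trips VM_BUG_ON_PAGE above).
 */
long toy_scan_block(const bool *initialized, const struct toy_zone *zone,
		    unsigned long start_pfn, unsigned long nr_pages,
		    bool clamp_to_zone)
{
	unsigned long end_pfn = start_pfn + nr_pages;
	unsigned long pfn;
	long inspected = 0;

	/* The block must begin inside the zone for the clamp to make sense. */
	assert(start_pfn >= zone->start_pfn && start_pfn < zone->end_pfn);

	/* The idea of the fix: never walk past the end of the zone. */
	if (clamp_to_zone)
		end_pfn = toy_min(end_pfn, zone->end_pfn);

	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		if (!initialized[pfn])
			return -1;
		inspected++;
	}
	return inspected;
}
```

With a toy zone covering PFNs [0, 12), only those 12 entries marked
initialized, and an 8-page block starting at PFN 8, the unclamped walk
reaches PFN 12 and returns -1, while the clamped walk stops at the zone
end and inspects 4 pages.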

Link: http://lkml.kernel.org/r/20190128144506.15603-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Debugged-by: Mikhail Zaslonko <zaslonko@linux.ibm.com>
Tested-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1 parent 9bcdeb5
dontdiff
*.a
*.aux
*.bc
*.bin
*.bz2
*.c.[012]*.*
*.cis
*.cpio
*.csp
*.dsp
*.dvi
*.elf
*.eps
*.fw
*.gcno
*.gcov
*.gen.S
*.gif
*.grep
*.grp
*.gz
*.html
*.i
*.jpeg
*.ko
*.ll
*.log
*.lst
*.lzma
*.lzo
*.mo
*.moc
*.mod.c
*.o
*.o.*
*.order
*.orig
*.out
*.patch
*.pdf
*.plist
*.png
*.pot
*.ps
*.rej
*.s
*.sgml
*.so
*.so.dbg
*.symtypes
*.tab.c
*.tab.h
*.tex
*.ver
*.xml
*.xz
*_MODULES
*_vga16.c
*~
\#*#
*.9
.*
.*.d
.mm
53c700_d.h
CVS
ChangeSet
GPATH
GRTAGS
GSYMS
GTAGS
Image
Module.markers
Module.symvers
PENDING
SCCS
System.map*
TAGS
aconf
af_names.h
aic7*reg.h*
aic7*reg_print.c*
aic7*seq.h*
aicasm
aicdb.h*
altivec*.c
asm-offsets.h
asm_offsets.h
autoconf.h*
av_permissions.h
bbootsect
bin2c
binkernel.spec
bootsect
bounds.h
bsetup
btfixupprep
build
bvmlinux
bzImage*
capability_names.h
capflags.c
classlist.h*
comp*.log
compile.h*
conf
config
config-*
config_data.h*
config.mak
config.mak.autogen
conmakehash
consolemap_deftbl.c*
cpustr.h
crc32table.h*
cscope.*
defkeymap.c
devlist.h*
devicetable-offsets.h
dnotify_test
dslm
dtc
elf2ecoff
elfconfig.h*
evergreen_reg_safe.h
fixdep
flask.h
fore200e_mkfirm
fore200e_pca_fw.c*
gconf
gconf.glade.h
gen-devlist
gen_crc32table
gen_init_cpio
generated
genheaders
genksyms
*_gray256.c
hpet_example
hugepage-mmap
hugepage-shm
ihex2fw
inat-tables.c
initramfs_list
int16.c
int1.c
int2.c
int32.c
int4.c
int8.c
kallsyms
kconfig
keywords.c
ksym.c*
ksym.h*
kxgettext
*lex.c
*lex.*.c
linux
logo_*.c
logo_*_clut224.c
logo_*_mono.c
lxdialog
mach-types
mach-types.h
machtypes.h
map
map_hugetlb
mconf
miboot*
mk_elfconfig
mkboot
mkbugboot
mkcpustr
mkdep
mkprep
mkregtable
mktables
mktree
modpost
modules.builtin
modules.order
modversions.h*
nconf
ncscope.*
offset.h
oui.c*
page-types
parse.c
parse.h
patches*
pca200e.bin
pca200e_ecd.bin2
perf.data
perf.data.old
perf-archive
piggyback
piggy.gzip
piggy.S
pnmtologo
ppc_defs.h*
pss_boot.h
qconf
r100_reg_safe.h
r200_reg_safe.h
r300_reg_safe.h
r420_reg_safe.h
r600_reg_safe.h
randomize_layout_hash.h
randomize_layout_seed.h
recordmcount
relocs
rlim_names.h
rn50_reg_safe.h
rs600_reg_safe.h
rv515_reg_safe.h
series
setup
setup.bin
setup.elf
sortextable
sImage
sm_tbl*
split-include
syscalltab.h
tables.c
tags
test_get_len
tftpboot.img
timeconst.h
times.h*
trix_boot.h
utsrelease.h*
vdso-syms.lds
vdso.lds
vdso32-int80-syms.lds
vdso32-syms.lds
vdso32-syscall-syms.lds
vdso32-sysenter-syms.lds
vdso32.lds
vdso32.so.dbg
vdso64.lds
vdso64.so.dbg
version.h*
vmImage
vmlinux
vmlinux-*
vmlinux.aout
vmlinux.bin.all
vmlinux.lds
vmlinuz
voffset.h
vsyscall.lds
vsyscall_32.lds
wanxlfw.inc
uImage
unifdef
wakeup.bin
wakeup.elf
wakeup.lds
zImage*
zoffset.h