sort by:
Revision Author Date Message Commit Date
4ad9921 printk: finalize records with trailing newlines Any record with a trailing newline (LOG_NEWLINE flag) cannot be continued because the newline has been stripped and will not be visible if the message is appended. This was already handled correctly when committing in log_output() but was not handled correctly when committing in log_store(). Fixes: f5f022e53b87 ("printk: reimplement log_cont using record extension") Link: https://lore.kernel.org/r/20201126114836.14750-1-john.ogness@linutronix.de Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: John Ogness <john.ogness@linutronix.de> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> 27 November 2020, 10:58:54 UTC
8119c43 Merge tag 'printk-for-5.10-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk fix from Petr Mladek: "Prevent overflow in the new lockless ringbuffer" * tag 'printk-for-5.10-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: printk: ringbuffer: Wrong data pointer when appending small string 16 October 2020, 19:52:37 UTC
49dc6fb Merge tag 'kgdb-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux Pull kgdb updates from Daniel Thompson: "A fairly modest set of changes for this cycle. Of particular note are an earlycon fix from Doug Anderson and my own changes to get kgdb/kdb to honour the kprobe blocklist. The later creates a safety rail that strongly encourages developers not to place breakpoints in, for example, arch specific trap handling code. Also included are a couple of small fixes and tweaks: an API update, eliminate a coverity dead code warning, improved handling of search during multi-line printk and a couple of typo corrections" * tag 'kgdb-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux: kdb: Fix pager search for multi-line strings kernel: debug: Centralize dbg_[de]activate_sw_breakpoints kgdb: Add NOKPROBE labels on the trap handler functions kgdb: Honour the kprobe blocklist when setting breakpoints kernel/debug: Fix spelling mistake in debug_core.c kdb: Use newer api for tasklist scanning kgdb: Make "kgdbcon" work properly with "kgdb_earlycon" kdb: remove unnecessary null check of dbg_io_ops 16 October 2020, 19:47:18 UTC
09a31a7 Merge tag 'mips_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS updates from Thomas Bogendoerfer: - removed support for PNX833x alias NXT_STB22x - included Ingenic SoC support into generic MIPS kernels - added support for new Ingenic SoCs - converted workaround selection to use Kconfig - replaced old boot mem functions by memblock_* - enabled COP2 usage in kernel for Loongson64 to make use of 16byte load/stores possible - cleanups and fixes * tag 'mips_5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (92 commits) MIPS: DEC: Restore bootmem reservation for firmware working memory area MIPS: dec: fix section mismatch bcm963xx_tag.h: fix duplicated word mips: ralink: enable zboot support MIPS: ingenic: Remove CPU_SUPPORTS_HUGEPAGES MIPS: cpu-probe: remove MIPS_CPU_BP_GHIST option bit MIPS: cpu-probe: introduce exclusive R3k CPU probe MIPS: cpu-probe: move fpu probing/handling into its own file MIPS: replace add_memory_region with memblock MIPS: Loongson64: Clean up numa.c MIPS: Loongson64: Select SMP in Kconfig to avoid build error mips: octeon: Add Ubiquiti E200 and E220 boards MIPS: SGI-IP28: disable use of ll/sc in kernel MIPS: tx49xx: move tx4939_add_memory_regions into only user MIPS: pgtable: Remove used PAGE_USERIO define MIPS: alchemy: Share prom_init implementation MIPS: alchemy: Fix build breakage, if TOUCHSCREEN_WM97XX is disabled MIPS: process: include exec.h header in process.c MIPS: process: Add prototype for function arch_dup_task_struct MIPS: idle: Add prototype for function check_wait ... 16 October 2020, 19:40:55 UTC
847d428 Merge tag 's390-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Remove address space overrides using set_fs() - Convert to generic vDSO - Convert to generic page table dumper - Add ARCH_HAS_DEBUG_WX support - Add leap seconds handling support - Add NVMe firmware-assisted kernel dump support - Extend NVMe boot support with memory clearing control and addition of kernel parameters - AP bus and zcrypt api code rework. Add adapter configure/deconfigure interface. Extend debug features. Add failure injection support - Add ECC secure private keys support - Add KASan support for running protected virtualization host with 4-level paging - Utilize destroy page ultravisor call to speed up secure guests shutdown - Implement ioremap_wc() and ioremap_prot() with MIO in PCI code - Various checksum improvements - Other small various fixes and improvements all over the code * tag 's390-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (85 commits) s390/uaccess: fix indentation s390/uaccess: add default cases for __put_user_fn()/__get_user_fn() s390/zcrypt: fix wrong format specifications s390/kprobes: move insn_page to text segment s390/sie: fix typo in SIGP code description s390/lib: fix kernel doc for memcmp() s390/zcrypt: Introduce Failure Injection feature s390/zcrypt: move ap_msg param one level up the call chain s390/ap/zcrypt: revisit ap and zcrypt error handling s390/ap: Support AP card SCLP config and deconfig operations s390/sclp: Add support for SCLP AP adapter config/deconfig s390/ap: add card/queue deconfig state s390/ap: add error response code field for ap queue devices s390/ap: split ap queue state machine state from device state s390/zcrypt: New config switch CONFIG_ZCRYPT_DEBUG s390/zcrypt: introduce msg tracking in zcrypt functions s390/startup: correct early pgm check info formatting s390: remove orphaned extern variables declarations s390/kasan: make sure int handler always run with DAT on s390/ipl: add support to control memory clearing for nvme re-IPL ... 16 October 2020, 19:36:38 UTC
96685f8 Merge tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - A series from Nick adding ARCH_WANT_IRQS_OFF_ACTIVATE_MM & selecting it for powerpc, as well as a related fix for sparc. - Remove support for PowerPC 601. - Some fixes for watchpoints & addition of a new ptrace flag for detecting ISA v3.1 (Power10) watchpoint features. - A fix for kernels using 4K pages and the hash MMU on bare metal Power9 systems with > 16TB of RAM, or RAM on the 2nd node. - A basic idle driver for shallow stop states on Power10. - Tweaks to our sched domains code to better inform the scheduler about the hardware topology on Power9/10, where two SMT4 cores can be presented by firmware as an SMT8 core. - A series doing further reworks & cleanups of our EEH code. - Addition of a filter for RTAS (firmware) calls done via sys_rtas(), to prevent root from overwriting kernel memory. - Other smaller features, fixes & cleanups. Thanks to: Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Athira Rajeev, Biwen Li, Cameron Berkenpas, Cédric Le Goater, Christophe Leroy, Christoph Hellwig, Colin Ian King, Daniel Axtens, David Dai, Finn Thain, Frederic Barrat, Gautham R. Shenoy, Greg Kurz, Gustavo Romero, Ira Weiny, Jason Yan, Joel Stanley, Jordan Niethe, Kajol Jain, Konrad Rzeszutek Wilk, Laurent Dufour, Leonardo Bras, Liu Shixin, Luca Ceresoli, Madhavan Srinivasan, Mahesh Salgaonkar, Nathan Lynch, Nicholas Mc Guire, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Pedro Miraglia Franco de Carvalho, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Ravi Bangoria, Russell Currey, Satheesh Rajendran, Scott Cheloha, Segher Boessenkool, Srikar Dronamraju, Stan Johnson, Stephen Kitt, Stephen Rothwell, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain, Vaidyanathan Srinivasan, Vasant Hegde, Wang Wensheng, Wolfram Sang, Yang Yingliang, zhengbin. * tag 'powerpc-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (228 commits) Revert "powerpc/pci: unmap legacy INTx interrupts when a PHB is removed" selftests/powerpc: Fix eeh-basic.sh exit codes cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_reboot_notifier powerpc/time: Make get_tb() common to PPC32 and PPC64 powerpc/time: Make get_tbl() common to PPC32 and PPC64 powerpc/time: Remove get_tbu() powerpc/time: Avoid using get_tbl() and get_tbu() internally powerpc/time: Make mftb() common to PPC32 and PPC64 powerpc/time: Rename mftbl() to mftb() powerpc/32s: Remove #ifdef CONFIG_PPC_BOOK3S_32 in head_book3s_32.S powerpc/32s: Rename head_32.S to head_book3s_32.S powerpc/32s: Setup the early hash table at all time. powerpc/time: Remove ifdef in get_dec() and set_dec() powerpc: Remove get_tb_or_rtc() powerpc: Remove __USE_RTC() powerpc: Tidy up a bit after removal of PowerPC 601. powerpc: Remove support for PowerPC 601 powerpc: Remove PowerPC 601 powerpc: Drop SYNC_601() ISYNC_601() and SYNC() powerpc: Remove CONFIG_PPC601_SYNC_FIX ... 16 October 2020, 19:21:15 UTC
c4cf498 Merge branch 'akpm' (patches from Andrew) Merge more updates from Andrew Morton: "155 patches. Subsystems affected by this patch series: mm (dax, debug, thp, readahead, page-poison, util, memory-hotplug, zram, cleanups), misc, core-kernel, get_maintainer, MAINTAINERS, lib, bitops, checkpatch, binfmt, ramfs, autofs, nilfs, rapidio, panic, relay, kgdb, ubsan, romfs, and fault-injection" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (155 commits) lib, uaccess: add failure injection to usercopy functions lib, include/linux: add usercopy failure capability ROMFS: support inode blocks calculation ubsan: introduce CONFIG_UBSAN_LOCAL_BOUNDS for Clang sched.h: drop in_ubsan field when UBSAN is in trap mode scripts/gdb/tasks: add headers and improve spacing format scripts/gdb/proc: add struct mount & struct super_block addr in lx-mounts command kernel/relay.c: drop unneeded initialization panic: dump registers on panic_on_warn rapidio: fix the missed put_device() for rio_mport_add_riodev rapidio: fix error handling path nilfs2: fix some kernel-doc warnings for nilfs2 autofs: harden ioctl table ramfs: fix nommu mmap with gaps in the page cache mm: remove the now-unnecessary mmget_still_valid() hack mm/gup: take mmap_lock in get_dump_page() binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot coredump: rework elf/elf_fdpic vma_dump_size() into common helper coredump: refactor page range dumping into common helper coredump: let dump_emit() bail out on short writes ... 16 October 2020, 18:31:55 UTC
4d0e9df lib, uaccess: add failure injection to usercopy functions To test fault-tolerance of user memory access functions, introduce fault injection to usercopy functions. If a failure is expected return either -EFAULT or the total amount of bytes that were not copied. Signed-off-by: Albert van der Linde <alinde@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Marco Elver <elver@google.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20200831171733.955393-3-alinde@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
2c739ce lib, include/linux: add usercopy failure capability Patch series "add fault injection to user memory access", v3. The goal of this series is to improve testing of fault-tolerance in usages of user memory access functions, by adding support for fault injection. syzkaller/syzbot are using the existing fault injection modes and will use this particular feature also. The first patch adds failure injection capability for usercopy functions. The second changes usercopy functions to use this new failure capability (copy_from_user, ...). The third patch adds get/put/clear_user failures to x86. This patch (of 3): Add a failure injection capability to improve testing of fault-tolerance in usages of user memory access functions. Add CONFIG_FAULT_INJECTION_USERCOPY to enable faults in usercopy functions. The should_fail_usercopy function is to be called by these functions (copy_from_user, get_user, ...) in order to fail or not. Signed-off-by: Albert van der Linde <alinde@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Marco Elver <elver@google.com> Cc: Christoph Hellwig <hch@lst.de> Link: http://lkml.kernel.org/r/20200831171733.955393-1-alinde@google.com Link: http://lkml.kernel.org/r/20200831171733.955393-2-alinde@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
d9bc85d ROMFS: support inode blocks calculation When use 'stat' tool to display file status, the 'Blocks' field always in '0', this is not good for tool 'du'(e.g.: busybox 'du'), it always output '0' size for the files under ROMFS since such tool calculates number of 512B Blocks. This patch calculates approx. number of 512B blocks based on inode size. Signed-off-by: Libing Zhou <libing.zhou@nokia-sbell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: David Howells <dhowells@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Link: http://lkml.kernel.org/r/20200811052606.4243-1-libing.zhou@nokia-sbell.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
6a6155f ubsan: introduce CONFIG_UBSAN_LOCAL_BOUNDS for Clang When the kernel is compiled with Clang, -fsanitize=bounds expands to -fsanitize=array-bounds and -fsanitize=local-bounds. Enabling -fsanitize=local-bounds with Clang has the unfortunate side-effect of inserting traps; this goes back to its original intent, which was as a hardening and not a debugging feature [1]. The same feature made its way into -fsanitize=bounds, but the traps remained. For that reason, -fsanitize=bounds was split into 'array-bounds' and 'local-bounds' [2]. Since 'local-bounds' doesn't behave like a normal sanitizer, enable it with Clang only if trapping behaviour was requested by CONFIG_UBSAN_TRAP=y. Add the UBSAN_BOUNDS_LOCAL config to Kconfig.ubsan to enable the 'local-bounds' option by default when UBSAN_TRAP is enabled. [1] http://lists.llvm.org/pipermail/llvm-dev/2012-May/049972.html [2] http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20131021/091536.html Suggested-by: Marco Elver <elver@google.com> Signed-off-by: George Popescu <georgepope@android.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Brazdil <dbrazdil@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Michal Marek <michal.lkml@markovi.net> Cc: Nathan Chancellor <natechancellor@gmail.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lkml.kernel.org/r/20200922074330.2549523-1-georgepope@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
5cf53f3 sched.h: drop in_ubsan field when UBSAN is in trap mode in_ubsan field of task_struct is only used in lib/ubsan.c, which in its turn is used only `ifneq ($(CONFIG_UBSAN_TRAP),y)`. Removing unnecessary field from a task_struct will help preserve the ABI between vanilla and CONFIG_UBSAN_TRAP'ed kernels. In particular, this will help enabling bounds sanitizer transparently for Android's GKI. Signed-off-by: Elena Petrova <lenaptr@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Kees Cook <keescook@chromium.org> Cc: Jann Horn <jannh@google.com> Link: https://lkml.kernel.org/r/20200910134802.3160311-1-lenaptr@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
4fbe310 scripts/gdb/tasks: add headers and improve spacing format With the patch. <e.g. o/p> TASK PID COMM 0xffffffff82c2b8c0 0 swapper/0 0xffff888a0ba20040 1 systemd 0xffff888a0ba24040 2 kthreadd 0xffff888a0ba28040 3 rcu_gp w/o 0xffffffff82c2b8c0 <init_task> 0 swapper/0 0xffff888a0ba20040 1 systemd 0xffff888a0ba24040 2 kthreadd 0xffff888a0ba28040 3 rcu_gp Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Link: http://lkml.kernel.org/r/54c868c79b5fc364a8be7799891934a6fe6d1464.1597742951.git.riteshh@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
998ec76 scripts/gdb/proc: add struct mount & struct super_block addr in lx-mounts command This is many times found useful while debugging some FS related issue. <e.g. output> mount super_block devname pathname fstype options 0xffff888a0bfa4b40 0xffff888a0bfc1000 none / rootfs rw 0 0 0xffff888a033f75c0 0xffff8889fcf65000 /dev/root / ext4 rw,relatime 0 0 0xffff8889fc8ce040 0xffff888a0bb51000 devtmpfs /dev devtmpfs rw,relatime 0 0 Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Cc: Kieran Bingham <kbingham@kernel.org> Link: http://lkml.kernel.org/r/a3c4177e1597b3e06d66d55e07d72c0c46a03571.1597742951.git.riteshh@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
ac05b7a kernel/relay.c: drop unneeded initialization The variable 'consumed' is initialized with the consumed count but immediately after that the consumed count is updated and assigned to 'consumed' again thus overwriting the previous value. So, drop the unneeded initialization. Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20201005205727.1147-1-sudipm.mukherjee@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
3f388f2 panic: dump registers on panic_on_warn Currently we print stack and registers for ordinary warnings but we do not for panic_on_warn which looks as oversight - panic() will reboot the machine but won't print registers. This moves printing of registers and modules earlier. This does not move the stack dumping as panic() dumps it. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Douglas Anderson <dianders@chromium.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Rafael Aquini <aquini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Link: https://lkml.kernel.org/r/20200804095054.68724-1-aik@ozlabs.ru Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
85094c0 rapidio: fix the missed put_device() for rio_mport_add_riodev rio_mport_add_riodev() misses to call put_device() when the device already exists. Add the missed function call to fix it. Fixes: e8de370188d0 ("rapidio: add mport char device driver") Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Kees Cook <keescook@chromium.org> Cc: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Link: https://lkml.kernel.org/r/20200922072525.42330-1-jingxiangfeng@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
fa63f08 rapidio: fix error handling path rio_dma_transfer() attempts to clamp the return value of pin_user_pages_fast() to be >= 0. However, the attempt fails because nr_pages is overridden a few lines later, and restored to the undesirable -ERRNO value. The return value is ultimately stored in nr_pages, which in turn is passed to unpin_user_pages(), which expects nr_pages >= 0, else, disaster. Fix this by fixing the nesting of the assignment to nr_pages: nr_pages should be clamped to zero if pin_user_pages_fast() returns -ERRNO, or set to the return value of pin_user_pages_fast(), otherwise. [jhubbard@nvidia.com: new changelog] Fixes: e8de370188d09 ("rapidio: add mport char device driver") Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lkml.kernel.org/r/1600227737-20785-1-git-send-email-jrdr.linux@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
64ead52 nilfs2: fix some kernel-doc warnings for nilfs2 Fixes the following W=1 kernel build warning(s): fs/nilfs2/bmap.c:378: warning: Excess function parameter 'bhp' description in 'nilfs_bmap_assign' fs/nilfs2/cpfile.c:907: warning: Excess function parameter 'status' description in 'nilfs_cpfile_change_cpmode' fs/nilfs2/cpfile.c:946: warning: Excess function parameter 'stat' description in 'nilfs_cpfile_get_stat' fs/nilfs2/page.c:76: warning: Excess function parameter 'inode' description in 'nilfs_forget_buffer' fs/nilfs2/sufile.c:563: warning: Excess function parameter 'stat' description in 'nilfs_sufile_get_stat' Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/1601386269-2423-1-git-send-email-konishi.ryusuke@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
589f6b5 autofs: harden ioctl table The table of ioctl functions should be marked const in order to put them in read-only memory, and we should use array_index_nospec() to avoid speculation disclosing the contents of kernel memory to userspace. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Ian Kent <raven@themaw.net> Link: https://lkml.kernel.org/r/20200818122203.GO17456@casper.infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
50b7d85 ramfs: fix nommu mmap with gaps in the page cache ramfs needs to check that pages are both physically contiguous and contiguous in the file. If the page cache happens to have, eg, page A for index 0 of the file, no page for index 1, and page A+1 for index 2, then an mmap of the first two pages of the file will succeed when it should fail. Fixes: 642fb4d1f1dd ("[PATCH] NOMMU: Provide shared-writable mmap support on ramfs") Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: David Howells <dhowells@redhat.com> Link: https://lkml.kernel.org/r/20200914122239.GO6583@casper.infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
4d45e75 mm: remove the now-unnecessary mmget_still_valid() hack The preceding patches have ensured that core dumping properly takes the mmap_lock. Thanks to that, we can now remove mmget_still_valid() and all its users. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200827114932.3572699-8-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:22 UTC
7f3bfab mm/gup: take mmap_lock in get_dump_page() Properly take the mmap_lock before calling into the GUP code from get_dump_page(); and play nice, allowing the GUP code to drop the mmap_lock if it has to sleep. As Linus pointed out, we don't actually need the VMA because __get_user_pages() will flush the dcache for us if necessary. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200827114932.3572699-7-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
a07279c binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot In both binfmt_elf and binfmt_elf_fdpic, use a new helper dump_vma_snapshot() to take a snapshot of the VMA list (including the gate VMA, if we have one) while protected by the mmap_lock, and then use that snapshot instead of walking the VMA list without locking. An alternative approach would be to keep the mmap_lock held across the entire core dumping operation; however, keeping the mmap_lock locked while we may be blocked for an unbounded amount of time (e.g. because we're dumping to a FUSE filesystem or so) isn't really optimal; the mmap_lock blocks things like the ->release handler of userfaultfd, and we don't really want critical system daemons to grind to a halt just because someone "gifted" them SCM_RIGHTS to an eternally-locked userfaultfd, or something like that. Since both the normal ELF code and the FDPIC ELF code need this functionality (and if any other binfmt wants to add coredump support in the future, they'd probably need it, too), implement this with a common helper in fs/coredump.c. A downside of this approach is that we now need a bigger amount of kernel memory per userspace VMA in the normal ELF case, and that we need O(n) kernel memory in the FDPIC ELF case at all; but 40 bytes per VMA shouldn't be terribly bad. There currently is a data race between stack expansion and anything that reads ->vm_start or ->vm_end under the mmap_lock held in read mode; to mitigate that for core dumping, take the mmap_lock in write mode when taking a snapshot of the VMA hierarchy. (If we only took the mmap_lock in read mode, we could end up with a corrupted core dump if someone does get_user_pages_remote() concurrently. Not really a major problem, but taking the mmap_lock either way works here, so we might as well avoid the issue.) (This doesn't do anything about the existing data races with stack expansion in other mm code.) Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200827114932.3572699-6-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
429a22e coredump: rework elf/elf_fdpic vma_dump_size() into common helper At the moment, the binfmt_elf and binfmt_elf_fdpic code have slightly different code to figure out which VMAs should be dumped, and if so, whether the dump should contain the entire VMA or just its first page. Eliminate duplicate code by reworking the binfmt_elf version into a generic core dumping helper in coredump.c. As part of that, change the heuristic for detecting executable/library header pages to check whether the inode is executable instead of looking at the file mode. This is less problematic in terms of locking because it lets us avoid get_user() under the mmap_sem. (And arguably it looks nicer and makes more sense in generic code.) Adjust a little bit based on the binfmt_elf_fdpic version: ->anon_vma is only meaningful under CONFIG_MMU, otherwise we have to assume that the VMA has been written to. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200827114932.3572699-5-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
afc63a9 coredump: refactor page range dumping into common helper Both fs/binfmt_elf.c and fs/binfmt_elf_fdpic.c need to dump ranges of pages into the coredump file. Extract that logic into a common helper. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200827114932.3572699-4-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
df0c09c coredump: let dump_emit() bail out on short writes dump_emit() has a retry loop, but there seems to be no way for that retry logic to actually be used; and it was also buggy, writing the same data repeatedly after a short write. Let's just bail out on a short write. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200827114932.3572699-3-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
8f942ee binfmt_elf_fdpic: stop using dump_emit() on user pointers on !MMU Patch series "Fix ELF / FDPIC ELF core dumping, and use mmap_lock properly in there", v5. At the moment, we have that rather ugly mmget_still_valid() helper to work around <https://crbug.com/project-zero/1790>: ELF core dumping doesn't take the mmap_sem while traversing the task's VMAs, and if anything (like userfaultfd) then remotely messes with the VMA tree, fireworks ensue. So at the moment we use mmget_still_valid() to bail out in any writers that might be operating on a remote mm's VMAs. With this series, I'm trying to get rid of the need for that as cleanly as possible. ("cleanly" meaning "avoid holding the mmap_lock across unbounded sleeps".) Patches 1, 2, 3 and 4 are relatively unrelated cleanups in the core dumping code. Patches 5 and 6 implement the main change: Instead of repeatedly accessing the VMA list with sleeps in between, we snapshot it at the start with proper locking, and then later we just use our copy of the VMA list. This ensures that the kernel won't crash, that VMA metadata in the coredump is consistent even in the presence of concurrent modifications, and that any virtual addresses that aren't being concurrently modified have their contents show up in the core dump properly. The disadvantage of this approach is that we need a bit more memory during core dumping for storing metadata about all VMAs. At the end of the series, patch 7 removes the old workaround for this issue (mmget_still_valid()). I have tested: - Creating a simple core dump on X86-64 still works. - The created coredump on X86-64 opens in GDB and looks plausible. - X86-64 core dumps contain the first page for executable mappings at offset 0, and don't contain the first page for non-executable file mappings or executable mappings at offset !=0. - NOMMU 32-bit ARM can still generate plausible-looking core dumps through the FDPIC implementation. (I can't test this with GDB because GDB is missing some structure definition for nommu ARM, but I've poked around in the hexdump and it looked decent.) This patch (of 7): dump_emit() is for kernel pointers, and VMAs describe userspace memory. Let's be tidy here and avoid accessing userspace pointers under KERNEL_DS, even if it probably doesn't matter much on !MMU systems - especially given that it looks like we can just use the same get_dump_page() as on MMU if we move it out of the CONFIG_MMU block. One small change we have to make in get_dump_page() is to use __get_user_pages_locked() instead of __get_user_pages(), since the latter doesn't exist on nommu. On mmu builds, __get_user_pages_locked() will just call __get_user_pages() for us. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200827114932.3572699-1-jannh@google.com Link: http://lkml.kernel.org/r/20200827114932.3572699-2-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
206e22f tools/testing/selftests: add self-test for verifying load alignment This produces a PIE binary with a variety of p_align requirements, suitable for verifying that the load address meets that alignment requirement. Signed-off-by: Chris Kennelly <ckennelly@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Fangrui Song <maskray@google.com> Cc: Hugh Dickens <hughd@google.com> Cc: Ian Rogers <irogers@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Sandeep Patil <sspatil@google.com> Cc: Song Liu <songliubraving@fb.com> Cc: Suren Baghdasaryan <surenb@google.com> Link: https://lkml.kernel.org/r/20200820170541.1132271-3-ckennelly@google.com Link: https://lkml.kernel.org/r/20200821233848.3904680-3-ckennelly@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
ce81bb2 fs/binfmt_elf: use PT_LOAD p_align values for suitable start address Patch series "Selecting Load Addresses According to p_align", v3. The current ELF loading mechancism provides page-aligned mappings. This can lead to the program being loaded in a way unsuitable for file-backed, transparent huge pages when handling PIE executables. While specifying -z,max-page-size=0x200000 to the linker will generate suitably aligned segments for huge pages on x86_64, the executable needs to be loaded at a suitably aligned address as well. This alignment requires the binary's cooperation, as distinct segments need to be appropriately paddded to be eligible for THP. For binaries built with increased alignment, this limits the number of bits usable for ASLR, but provides some randomization over using fixed load addresses/non-PIE binaries. This patch (of 2): The current ELF loading mechancism provides page-aligned mappings. This can lead to the program being loaded in a way unsuitable for file-backed, transparent huge pages when handling PIE executables. For binaries built with increased alignment, this limits the number of bits usable for ASLR, but provides some randomization over using fixed load addresses/non-PIE binaries. Tested by verifying program with -Wl,-z,max-page-size=0x200000 loading. [akpm@linux-foundation.org: fix max() warning] [ckennelly@google.com: augment comment] Link: https://lkml.kernel.org/r/20200821233848.3904680-2-ckennelly@google.com Signed-off-by: Chris Kennelly <ckennelly@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: David Rientjes <rientjes@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Hugh Dickens <hughd@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sandeep Patil <sspatil@google.com> Cc: Fangrui Song <maskray@google.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Link: https://lkml.kernel.org/r/20200820170541.1132271-1-ckennelly@google.com Link: https://lkml.kernel.org/r/20200820170541.1132271-2-ckennelly@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
48ca2d8 checkpatch: add new warnings to author signoff checks. The author signed-off-by checks are currently very vague. Cases like same name or same address are not handled separately. For example, running checkpatch on commit be6577af0cef ("parisc: Add atomic64_set_release() define to avoid CPU soft lockups"), gives: WARNING: Missing Signed-off-by: line by nominal patch author 'John David Anglin <dave.anglin@bell.net>' The signoff line was: "Signed-off-by: Dave Anglin <dave.anglin@bell.net>" Clearly the author has signed off but with a slightly different version of his name. A more appropriate warning would have been to point out at the name mismatch instead. Previously, the values assumed by $authorsignoff were either 0 or 1 to indicate whether a proper sign off by author is present. Extended the checks to handle four new cases. $authorsignoff values now denote the following: 0: Missing sign off by patch author. 1: Sign off present and identical. 2: Addresses and names match, but comments differ. "James Watson(JW) <james@gmail.com>", "James Watson <james@gmail.com>" 3: Addresses match, but names are different. "James Watson <james@gmail.com>", "James <james@gmail.com>" 4: Names match, but addresses are different. "James Watson <james@watson.com>", "James Watson <james@gmail.com>" 5: Names match, addresses excluding subaddress details (RFC 5233) match. "James Watson <james@gmail.com>", "James Watson <james+a@gmail.com>" Also introduced a new message type FROM_SIGN_OFF_MISMATCH for cases 2, 3, 4 and 5. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/linux-kernel-mentees/c1ca28e77e8e3bfa7aadf3efa8ed70f97a9d369c.camel@perches.com/ Link: https://lkml.kernel.org/r/20201007192029.551744-1-dwaipayanray1@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
c70735c checkpatch: fix false positive on empty block comment lines To avoid false positives in presence of SPDX-License-Identifier in networking files it is required to increase the leeway for empty block comment lines by one line. For example, checking drivers/net/loopback.c which starts with // SPDX-License-Identifier: GPL-2.0-or-later /* * INET An implementation of the TCP/IP protocol suite for the LINUX rsults in an unnecessary warning WARNING: networking block comments don't use an empty /* line, use /* Comment... +/* + * INET An implementation of the TCP/IP protocol suite for the LINUX Signed-off-by: Łukasz Stelmach <l.stelmach@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joe Perches <joe@perches.com> Cc: Bartłomiej Żolnierkiewicz <b.zolnierkie@samsung.co> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lkml.kernel.org/r/20201006083509.19934-1-l.stelmach@samsung.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
2e44e80 checkpatch: fix multi-statement macro checks for while blocks. Checkpatch.pl doesn't have a check for excluding while (...) {...} blocks from MULTISTATEMENT_MACRO_USE_DO_WHILE error. For example, running checkpatch.pl on the file mm/maccess.c in the kernel generates the following error: ERROR: Macros with complex values should be enclosed in parentheses +#define copy_from_kernel_nofault_loop(dst, src, len, type, err_label) \ + while (len >= sizeof(type)) { \ + __get_kernel_nofault(dst, src, type, err_label); \ + dst += sizeof(type); \ + src += sizeof(type); \ + len -= sizeof(type); \ + } The error is misleading for this case. Enclosing it in parentheses doesn't make any sense. Checkpatch already has an exception list for such common macro types. Added a new exception for while (...) {...} style blocks to the same. In addition, the brace flatten logic was modified by changing the substitution characters from "1" to "1u". This was done to ensure that macros in the form "#define foo(bar) while(bar){bar--;}" were also correctly procecssed. Link: https://lore.kernel.org/linux-kernel-mentees/dc985938aa3986702815a0bd68dfca8a03c85447.camel@perches.com/ Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20201001171903.312021-1-dwaipayanray1@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
a0154cd checkpatch: emit a warning on embedded filenames Embedding the complete filename path inside the file isn't particularly useful as often the path is moved around and becomes incorrect. Emit a warning when the source contains the filename. [akpm@linux-foundation.org: remove stray " di"] Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/1fd5f9188a14acdca703ca00301ee323de672a8d.camel@perches.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
e7f929f checkpatch: extend author Signed-off-by check for split From: header Checkpatch did not handle cases where the author From: header was split into multiple lines. The author identity could not be resolved and checkpatch generated a false NO_AUTHOR_SIGN_OFF warning. A typical example is commit e33bcbab16d1 ("tee: add support for session's client UUID generation"). When checkpatch was run on this commit, it displayed: "WARNING:NO_AUTHOR_SIGN_OFF: Missing Signed-off-by: line by nominal patch author ''" This was due to split header lines not being handled properly and the author himself wrote in commit cd2614967d8b ("checkpatch: warn if missing author Signed-off-by"): "Split From: headers are not fully handled: only the first part is compared." Support split From: headers by correctly parsing the header extension lines. RFC 5322, Section-2.2.3 stated that each extended line must start with a WSP character (a space or htab). The solution was therefore to concatenate the lines which start with a WSP to get the correct long header. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Reviewed-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Acked-by: Joe Perches <joe@perches.com> Link: https://lore.kernel.org/linux-kernel-mentees/f5d8124e54a50480b0a9fa638787bc29b6e09854.camel@perches.com/ Link: https://lkml.kernel.org/r/20200921085436.63003-1-dwaipayanray1@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
f5f6132 checkpatch: allow not using -f with files that are in git If a file exists in git and checkpatch is used without the -f flag for scanning a file, then checkpatch will scan the file assuming it's a patch and emit: ERROR: Does not appear to be a unified-diff format patch Change the behavior to assume the -f flag if the file exists in git. [joe@perches.com: fix git "fatal" warning if file argument outside kernel tree] Link: https://lkml.kernel.org/r/b6afa04112d450c2fc120a308d706acd60cee294.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Julia Lawall <julia.lawall@inria.fr> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Link: https://lkml.kernel.org/r/45b81a48e1568bd0126a96f5046eb7aaae9b83c9.camel@perches.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
99ca38c checkpatch: warn on self-assignments The uninitialized_var() macro was removed recently via commit 63a0895d960a ("compiler: Remove uninitialized_var() macro") as it's not a particularly useful warning and its use can "paper over real bugs". Add a checkpatch test to warn on self-assignments as a means to avoid compiler warnings and as a back-door mechanism to reproduce the old uninitialized_var macro behavior. [akpm@linux-foundation.org: coding style fixes] Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Denis Efremov <efremov@linux.com> Cc: Julia Lawall <julia.lawall@inria.fr> Link: https://lkml.kernel.org/r/afc2cffdd315d3e4394af149278df9e8af7f49f4.camel@perches.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
c12093a const_structs.checkpatch: add pinctrl_ops and pinmux_ops All usages of include/linux of these are const pointers, and all instances in the kernel except one, that are not const can be made const (patches have been posted for those separately). Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Joe Perches <joe@perches.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Rikard Falkeborn <rikard.falkeborn@gmail.com> Link: https://lkml.kernel.org/r/20200830224352.37114-1-rikard.falkeborn@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
8020b25 checkpatch: warn if trace_printk and friends are called trace_printk is meant as a debugging tool, and should not be compiled into production code without specific debug Kconfig options enabled, or source code changes, as indicated by the warning that shows up on boot if any trace_printk is called: ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE ** ** ** ** trace_printk() being used. Allocating extra memory. ** ** ** ** This means that this is a DEBUG kernel and it is ** ** unsafe for production use. ** Let's warn developers when they try to submit such a change. Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Joe Perches <joe@perches.com> Link: https://lkml.kernel.org/r/20200825193600.v2.1.I723c43c155f02f726c97501be77984f1e6bb740a@changeid Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
ed4761f const_structs.checkpatch: add phy_ops All usages of phy_ops in include/linux uses const phy_ops * and all instances of phy_ops in the kernel that are not const already can be made const (patches have been posted for those separately). Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Kishon Vijay Abraham I <kishon@ti.com> Cc: Vinod Koul <vkoul@kernel.org> Link: https://lkml.kernel.org/r/20200824214132.9072-1-rikard.falkeborn@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:21 UTC
40873ab checkpatch: add test for comma use that should be semicolon There are commas used as statement terminations that should typically have used semicolons instead. Only direct assignments or use of a single function or value on a single line are detected by this test. e.g.: foo = bar(), /* typical use is semicolon not comma */ bar = baz(); Add an imperfect test to detect these comma uses. No false positives were found in testing, but many types of false negatives are possible. e.g.: foo = bar() + 1, /* comma use, but not direct assignment */ bar = baz(); Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/3bf27caf462007dfa75647b040ab3191374a59de.camel@perches.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
310cd06 checkpatch: move repeated word test Currently this test only works on .[ch] files. Move the test to check more file types and the commit log. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/180b3b5677771c902b2e2f7a2b7090ede65fe004.camel@perches.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
3e89ad8 checkpatch: add --kconfig-prefix Kconfig allows to customize the CONFIG_ prefix via the $CONFIG_ environment variable. Out-of-tree projects may therefore use Kconfig with a different prefix, or they may use a custom configuration tool which does not use the CONFIG_ prefix at all. Such projects may still want to adhere to the Linux kernel coding style and run checkpatch.pl. One example is OP-TEE [1] which does not use Kconfig but does have configuration options prefixed with CFG_. It also mostly follows the kernel coding style and therefore being able to use checkpatch is quite valuable. To make this possible, add the --kconfig-prefix command line option. [1] https://github.com/OP-TEE/optee_os Signed-off-by: Jerome Forissier <jerome@forissier.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Joe Perches <joe@perches.com> Link: http://lkml.kernel.org/r/20200818081732.800449-1-jerome@forissier.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
004fba1 bitops: use the same mechanism for get_count_order[_long] These two functions share the same logic. Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lkml.kernel.org/r/20200807085837.11697-3-richard.weiyang@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
a9eb637 bitops: simplify get_count_order_long() These two cases could be unified into one. Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christian Brauner <christian.brauner@ubuntu.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lkml.kernel.org/r/20200807085837.11697-2-richard.weiyang@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
904542d lib/crc32.c: fix trivial typo in preprocessor condition Whether crc32_be needs a lookup table is chosen based on CRC_LE_BITS. Obviously, the _be function should be governed by the _BE_ define. This probably never pops up as it's hard to come up with a configuration where CRC_BE_BITS isn't the same as CRC_LE_BITS and as nobody is using bitwise CRC anyway. Fixes: 46c5801eaf86 ("crc32: bolt on crc32c") Signed-off-by: Tobias Jordan <kernel@cdqe.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Krzysztof Kozlowski <krzk@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lkml.kernel.org/r/20200923182122.GA3338@agrajag.zerfleddert.de Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
f3c9d0a lib/test_hmm.c: fix an error code in dmirror_allocate_chunk() This is supposed to return false on failure, not a negative error code. Fixes: 170e38548b81 ("mm/hmm/test: use after free in dmirror_allocate_chunk()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Dan Williams <dan.j.williams@intel.com> Link: https://lkml.kernel.org/r/20201010200812.GA1886610@mwanda Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
e130816 include/linux/list.h: add a macro to test if entry is pointing to the head Add a macro to test if entry is pointing to the head of the list which is useful in cases like: list_for_each_entry(pos, &head, member) { if (cond) break; } if (list_entry_is_head(pos, &head, member)) return -ERRNO; that allows to avoid additional variable to be added to track if loop has not been stopped in the middle. While here, convert list_for_each_entry*() family of macros to use a new one. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Link: https://lkml.kernel.org/r/20200929134342.51489-1-andriy.shevchenko@linux.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
1d33963 lib/percpu_counter.c: use helper macro abs() Use helper macro abs() to simplify the "x >= t || x <= -t" cmp. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200927122746.5964-1-linmiaohe@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
6ed9b92 lib/scatterlist.c: avoid a double memset 'sgl' is zeroed a few lines below in 'sg_init_table()'. There is no need to clear it twice. Remove the redundant initialization. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200920071544.368841-1-christophe.jaillet@wanadoo.fr Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
3264cee lib/idr.c: document that ida_simple_{get,remove}() are deprecated These two functions are deprecated. Users should call ida_alloc() or ida_free() respectively instead. Add documentation to this effect until the macro can be removed. Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Tri Vo <trong@android.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Link: https://lkml.kernel.org/r/20200910055246.2297797-2-swboyd@chromium.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
3b67426 lib/idr.c: document calling context for IDA APIs mustn't use locks The documentation for these functions indicates that callers don't need to hold a lock while calling them, but that documentation is only in one place under "IDA Usage". Let's state the same information on each IDA function so that it's clear what the calling context requires. Furthermore, let's document ida_simple_get() with the same information so that callers know how this API works. Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tri Vo <trong@android.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Matthew Wilcox <willy@infradead.org> Link: https://lkml.kernel.org/r/20200910055246.2297797-1-swboyd@chromium.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
8d8472c lib/mpi/mpi-bit.c: fix spello of "functions" Fix typo/spello of "functions". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/8df15173-a6df-9426-7cad-a2d279bf1170@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
2d04698 lib: test_sysctl: delete duplicated words Drop the repeated word "the". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200823040520.1999-1-rdunlap@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
408a93a lib: syscall: delete duplicated words Drop the repeated word "the". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200823040514.26136-1-rdunlap@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
e065650 lib: radix-tree: delete duplicated words Drop the repeated word "be". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200823040508.26086-1-rdunlap@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
4e20ace lib: earlycpio: delete duplicated words Drop the repeated word "the". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200823040455.25995-1-rdunlap@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
dde57fe lib: dynamic_queue_limits: delete duplicated words + fix typo Drop the repeated word "the". Fix spelling of "excess". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200823040449.25946-1-rdunlap@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:20 UTC
2f22385 lib: decompress_bunzip2: delete duplicated words Drop the repeated word "how". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200823040436.25852-1-rdunlap@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
f1e594a lib: libcrc32c: delete duplicated words Drop the repeated word "the". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200823040430.25807-1-rdunlap@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
197d6c1 lib: bitmap: delete duplicated words Drop the repeated word "an". Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200823040424.25760-1-rdunlap@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
32dd8af MAINTAINERS: jarkko.sakkinen@linux.intel.com -> jarkko@kernel.org Use @kernel.org address as the main communications end point. Update the corresponding M-entries and .mailmap (for git shortlog translation). Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Joe Perches <joe@perches.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kees Cook <keescook@chromium.org> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Rob Herring <robh@kernel.org> Link: https://lkml.kernel.org/r/20201015142710.8371-1-jarkko.sakkinen@linux.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
6343f6b get_maintainer: exclude MAINTAINERS file(s) from --git-fallback MAINTAINERS files generally have no specific maintainer but are updated by individuals for subsystems all over the source tree. Exclude MAINTAINERS file(s) from --git-fallback searches so the unlucky individuals that update the files the most are not shown by default. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Rob Herring <robh@kernel.org> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Link: https://lkml.kernel.org/r/2bacb0a9c06fbb6d56a43bf930e808c74243c908.camel@perches.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
cdfe2d2 get_maintainer: add test for file in VCS It's somewhat common for me to ask get_maintainer to tell me who maintains a patch file rather than the files modified by the patch. Emit a warning if using get_maintainer.pl -f <patchfile> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/f63229c051567041819f25e76f49d83c6e4c0f71.camel@perches.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
b7621eb kernel: acct.c: fix some kernel-doc nits Fix kernel-doc notation to use the documented Returns: syntax and place the function description for acct_process() on the first line where it should be. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Link: https://lkml.kernel.org/r/b4c33e5d-98e8-0c47-77b6-ac1859f94d7f@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
7b7b8a2 kernel/: fix repeated words in comments Fix multiple occurrences of duplicated words in kernel/. Fix one typo/spello on the same line as a duplicate word. Change one instance of "the the" to "that the". Otherwise just drop one of the repeated words. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/98202fa6-8919-ef63-9efe-c0fad5ca7af1@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
15ec0fc kernel/sys.c: replace do_brk with do_brk_flags in comment of prctl_set_mm_map() Replace do_brk with do_brk_flags in comment of prctl_set_mm_map(), since do_brk was removed in following commit. Fixes: bb177a732c4369 ("mm: do not bug_on on incorrect length in __mm_populate()") Signed-off-by: Liao Pingfang <liao.pingfang@zte.com.cn> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/1600650751-43127-1-git-send-email-wang.yi59@zte.com.cn Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
b296a6d kernel.h: split out min()/max() et al. helpers kernel.h is being used as a dump for all kinds of stuff for a long time. Here is the attempt to start cleaning it up by splitting out min()/max() et al. helpers. At the same time convert users in header and lib folder to use new header. Though for time being include new header back to kernel.h to avoid twisted indirected includes for other existing users. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Joe Perches <joe@perches.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lkml.kernel.org/r/20200910164152.GA1891694@smile.fi.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
ce9bebe fs: configfs: delete repeated words in comments Drop duplicated words {the, that} in comments. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Joel Becker <jlbec@evilplan.org> Cc: Christoph Hellwig <hch@lst.de> Link: https://lkml.kernel.org/r/20200811021826.25032-1-rdunlap@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
ab130f9 mm: rename page_order() to buddy_order() The current page_order() can only be called on pages in the buddy allocator. For compound pages, you have to use compound_order(). This is confusing and led to a bug, so rename page_order() to buddy_order(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20201001152259.14932-2-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
1f0f8c0 include/linux/mmzone.h: remove unused early_pfn_valid() The early_pfn_valid() macro is defined but it is never used. Remove it. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David Hildenbrand <david@redhat.com> Link: https://lkml.kernel.org/r/20200923162915.26935-1-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
73eb7f9 mm: use helper function put_write_access() In commit 1da177e4c3f4 ("Linux-2.6.12-rc2"), the helper put_write_access() came with the atomic_dec operation of the i_writecount field. But it forgot to use this helper in __vma_link_file() and dup_mmap(). Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200924115235.5111-1-linmiaohe@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
e755f4a mm/workingset.c: fix some doc warnings Fix following warnings caused by mismatch bewteen function parameters and comments. mm/workingset.c:228: warning: Function parameter or member 'lruvec' not described in 'workingset_age_nonresident' mm/workingset.c:228: warning: Excess function parameter 'memcg' description in 'workingset_age_nonresident' Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/1600485913-11192-1-git-send-email-tanxiaofei@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
70b6d25 mm: fix some comments formatting Correct one function name "get_partials" with "get_partial". Update the old struct name of list3 with kmem_cache_node. Signed-off-by: Chen Tao <chentao3@hotmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lkml.kernel.org/r/Message-ID: Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
0e9aa67 mm: fix some broken comments Fix some broken comments including typo, grammar error and wrong function name. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/20200913095456.54873-1-linmiaohe@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
ed01737 mm: use self-explanatory macros rather than "2" Signed-off-by: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Alex Shi <alex.shi@linux.alibaba.com> Link: http://lkml.kernel.org/r/20200831175042.3527153-2-yuzhao@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:19 UTC
955cc77 mm/highmem.c: clean up endif comments The #endif at the end of the file matches up with the '#if defined(HASHED_PAGE_VIRTUAL)' on line 374. Not the CONFIG_HIGHMEM #if earlier. Fix comments on both of the #endif's to indicate the correct end of blocks for each. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Link: https://lkml.kernel.org/r/20200819184635.112579-1-ira.weiny@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
58f6f03 mm/page_reporting.c: drop stale list head check in page_reporting_cycle list_for_each_entry_safe() guarantees that we will never stumble over the list head; "&page->lru != list" will always evaluate to true. Let's simplify. [david@redhat.com: Changelog refinements] Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Link: http://lkml.kernel.org/r/20200818084448.33969-1-richard.weiyang@linux.alibaba.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
c7df08f mm/slab.h: remove duplicate include Remove duplicate header which is included twice. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pekka Enberg <penberg@kernel.org> Link: http://lkml.kernel.org/r/20200818114323.58156-1-yuehaibing@huawei.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
4e79603 zram: failing to decompress is WARN_ON worthy If we fail to decompress in zram it's a pretty serious problem. We were entrusted to be able to decompress the old data but we failed. Either we've got some crazy bug in the compression code or we've got memory corruption. At the moment, when this happens the log looks like this: ERR kernel: [ 1833.099861] zram: Decompression failed! err=-22, page=336112 ERR kernel: [ 1833.099881] zram: Decompression failed! err=-22, page=336112 ALERT kernel: [ 1833.099886] Read-error on swap-device (253:0:2688896) It is true that we have an "ALERT" level log in there, but (at least to me) it feels like even this isn't enough to impart the seriousness of this error. Let's convert to a WARN_ON. Note that WARN_ON is automatically "unlikely" so we can simply replace the old annotation with the new one. Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Sonny Rao <sonnyrao@chromium.org> Cc: Jens Axboe <axboe@kernel.dk> Link: https://lkml.kernel.org/r/20200917174059.1.If09c882545dbe432268f7a67a4d4cfcb6caace4f@changeid Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
b86c5fc mm/memory_hotplug: update comment regarding zone shuffling As we no longer shuffle via generic_online_page() and when undoing isolation, we can simplify the comment. We now effectively shuffle only once (properly) when onlining new memory. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@kernel.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Mike Rapoport <rppt@kernel.org> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Link: https://lkml.kernel.org/r/20201005121534.15649-6-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
7fef431 mm/page_alloc: place pages to tail in __free_pages_core() __free_pages_core() is used when exposing fresh memory to the buddy during system boot and when onlining memory in generic_online_page(). generic_online_page() is used in two cases: 1. Direct memory onlining in online_pages(). 2. Deferred memory onlining in memory-ballooning-like mechanisms (HyperV balloon and virtio-mem), when parts of a section are kept fake-offline to be fake-onlined later on. In 1, we already place pages to the tail of the freelist. Pages will be freed to MIGRATE_ISOLATE lists first and moved to the tail of the freelists via undo_isolate_page_range(). In 2, we currently don't implement a proper rule. In case of virtio-mem, where we currently always online MAX_ORDER - 1 pages, the pages will be placed to the HEAD of the freelist - undesireable. While the hyper-v balloon calls generic_online_page() with single pages, usually it will call it on successive single pages in a larger block. The pages are fresh, so place them to the tail of the freelist and avoid the PCP. In __free_pages_core(), remove the now superflouos call to set_page_refcounted() and add a comment regarding page initialization and the refcount. Note: In 2. we currently don't shuffle. If ever relevant (page shuffling is usually of limited use in virtualized environments), we might want to shuffle after a sequence of generic_online_page() calls in the relevant callers. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@kernel.org> Cc: Scott Cheloha <cheloha@linux.ibm.com> Link: https://lkml.kernel.org/r/20201005121534.15649-5-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
293ffa5 mm/page_alloc: move pages to tail in move_to_free_list() Whenever we move pages between freelists via move_to_free_list()/ move_freepages_block(), we don't actually touch the pages: 1. Page isolation doesn't actually touch the pages, it simply isolates pageblocks and moves all free pages to the MIGRATE_ISOLATE freelist. When undoing isolation, we move the pages back to the target list. 2. Page stealing (steal_suitable_fallback()) moves free pages directly between lists without touching them. 3. reserve_highatomic_pageblock()/unreserve_highatomic_pageblock() moves free pages directly between freelists without touching them. We already place pages to the tail of the freelists when undoing isolation via __putback_isolated_page(), let's do it in any case (e.g., if order <= pageblock_order) and document the behavior. To simplify, let's move the pages to the tail for all move_to_free_list()/move_freepages_block() users. In 2., the target list is empty, so there should be no change. In 3., we might observe a change, however, highatomic is more concerned about allocations succeeding than cache hotness - if we ever realize this change degrades a workload, we can special-case this instance and add a proper comment. This change results in all pages getting onlined via online_pages() to be placed to the tail of the freelist. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Link: https://lkml.kernel.org/r/20201005121534.15649-4-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
47b6a24 mm/page_alloc: place pages to tail in __putback_isolated_page() __putback_isolated_page() already documents that pages will be placed to the tail of the freelist - this is, however, not the case for "order >= MAX_ORDER - 2" (see buddy_merge_likely()) - which should be the case for all existing users. This change affects two users: - free page reporting - page isolation, when undoing the isolation (including memory onlining). This behavior is desirable for pages that haven't really been touched lately, so exactly the two users that don't actually read/write page content, but rather move untouched pages. The new behavior is especially desirable for memory onlining, where we allow allocation of newly onlined pages via undo_isolate_page_range() in online_pages(). Right now, we always place them to the head of the freelist, resulting in undesireable behavior: Assume we add individual memory chunks via add_memory() and online them right away to the NORMAL zone. We create a dependency chain of unmovable allocations e.g., via the memmap. The memmap of the next chunk will be placed onto previous chunks - if the last block cannot get offlined+removed, all dependent ones cannot get offlined+removed. While this can already be observed with individual DIMMs, it's more of an issue for virtio-mem (and I suspect also ppc DLPAR). Document that this should only be used for optimizations, and no code should rely on this behavior for correction (if the order of the freelists ever changes). We won't care about page shuffling: memory onlining already properly shuffles after onlining. free page reporting doesn't care about physically contiguous ranges, and there are already cases where page isolation will simply move (physically close) free pages to (currently) the head of the freelists via move_freepages_block() instead of shuffling. If this becomes ever relevant, we should shuffle the whole zone when undoing isolation of larger ranges, and after free_contig_range(). Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Link: https://lkml.kernel.org/r/20201005121534.15649-3-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
f04a5d5 mm/page_alloc: convert "report" flag of __free_one_page() to a proper flag Patch series "mm: place pages to the freelist tail when onlining and undoing isolation", v2. When adding separate memory blocks via add_memory*() and onlining them immediately, the metadata (especially the memmap) of the next block will be placed onto one of the just added+onlined block. This creates a chain of unmovable allocations: If the last memory block cannot get offlined+removed() so will all dependent ones. We directly have unmovable allocations all over the place. This can be observed quite easily using virtio-mem, however, it can also be observed when using DIMMs. The freshly onlined pages will usually be placed to the head of the freelists, meaning they will be allocated next, turning the just-added memory usually immediately un-removable. The fresh pages are cold, prefering to allocate others (that might be hot) also feels to be the natural thing to do. It also applies to the hyper-v balloon xen-balloon, and ppc64 dlpar: when adding separate, successive memory blocks, each memory block will have unmovable allocations on them - for example gigantic pages will fail to allocate. While the ZONE_NORMAL doesn't provide any guarantees that memory can get offlined+removed again (any kind of fragmentation with unmovable allocations is possible), there are many scenarios (hotplugging a lot of memory, running workload, hotunplug some memory/as much as possible) where we can offline+remove quite a lot with this patchset. a) To visualize the problem, a very simple example: Start a VM with 4GB and 8GB of virtio-mem memory: [root@localhost ~]# lsmem RANGE SIZE STATE REMOVABLE BLOCK 0x0000000000000000-0x00000000bfffffff 3G online yes 0-23 0x0000000100000000-0x000000033fffffff 9G online yes 32-103 Memory block size: 128M Total online memory: 12G Total offline memory: 0B Then try to unplug as much as possible using virtio-mem. Observe which memory blocks are still around. Without this patch set: [root@localhost ~]# lsmem RANGE SIZE STATE REMOVABLE BLOCK 0x0000000000000000-0x00000000bfffffff 3G online yes 0-23 0x0000000100000000-0x000000013fffffff 1G online yes 32-39 0x0000000148000000-0x000000014fffffff 128M online yes 41 0x0000000158000000-0x000000015fffffff 128M online yes 43 0x0000000168000000-0x000000016fffffff 128M online yes 45 0x0000000178000000-0x000000017fffffff 128M online yes 47 0x0000000188000000-0x0000000197ffffff 256M online yes 49-50 0x00000001a0000000-0x00000001a7ffffff 128M online yes 52 0x00000001b0000000-0x00000001b7ffffff 128M online yes 54 0x00000001c0000000-0x00000001c7ffffff 128M online yes 56 0x00000001d0000000-0x00000001d7ffffff 128M online yes 58 0x00000001e0000000-0x00000001e7ffffff 128M online yes 60 0x00000001f0000000-0x00000001f7ffffff 128M online yes 62 0x0000000200000000-0x0000000207ffffff 128M online yes 64 0x0000000210000000-0x0000000217ffffff 128M online yes 66 0x0000000220000000-0x0000000227ffffff 128M online yes 68 0x0000000230000000-0x0000000237ffffff 128M online yes 70 0x0000000240000000-0x0000000247ffffff 128M online yes 72 0x0000000250000000-0x0000000257ffffff 128M online yes 74 0x0000000260000000-0x0000000267ffffff 128M online yes 76 0x0000000270000000-0x0000000277ffffff 128M online yes 78 0x0000000280000000-0x0000000287ffffff 128M online yes 80 0x0000000290000000-0x0000000297ffffff 128M online yes 82 0x00000002a0000000-0x00000002a7ffffff 128M online yes 84 0x00000002b0000000-0x00000002b7ffffff 128M online yes 86 0x00000002c0000000-0x00000002c7ffffff 128M online yes 88 0x00000002d0000000-0x00000002d7ffffff 128M online yes 90 0x00000002e0000000-0x00000002e7ffffff 128M online yes 92 0x00000002f0000000-0x00000002f7ffffff 128M online yes 94 0x0000000300000000-0x0000000307ffffff 128M online yes 96 0x0000000310000000-0x0000000317ffffff 128M online yes 98 0x0000000320000000-0x0000000327ffffff 128M online yes 100 0x0000000330000000-0x000000033fffffff 256M online yes 102-103 Memory block size: 128M Total online memory: 8.1G Total offline memory: 0B With this patch set: [root@localhost ~]# lsmem RANGE SIZE STATE REMOVABLE BLOCK 0x0000000000000000-0x00000000bfffffff 3G online yes 0-23 0x0000000100000000-0x000000013fffffff 1G online yes 32-39 Memory block size: 128M Total online memory: 4G Total offline memory: 0B All memory can get unplugged, all memory block can get removed. Of course, no workload ran and the system was basically idle, but it highlights the issue - the fairly deterministic chain of unmovable allocations. When a huge page for the 2MB memmap is needed, a just-onlined 4MB page will be split. The remaining 2MB page will be used for the memmap of the next memory block. So one memory block will hold the memmap of the two following memory blocks. Finally the pages of the last-onlined memory block will get used for the next bigger allocations - if any allocation is unmovable, all dependent memory blocks cannot get unplugged and removed until that allocation is gone. Note that with bigger memory blocks (e.g., 256MB), *all* memory blocks are dependent and none can get unplugged again! b) Experiment with memory intensive workload I performed an experiment with an older version of this patch set (before we used undo_isolate_page_range() in online_pages(): Hotplug 56GB to a VM with an initial 4GB, onlining all memory to ZONE_NORMAL right from the kernel when adding it. I then run various memory intensive workloads that consume most system memory for a total of 45 minutes. Once finished, I try to unplug as much memory as possible. With this change, I am able to remove via virtio-mem (adding individual 128MB memory blocks) 413 out of 448 added memory blocks. Via individual (256MB) DIMMs 380 out of 448 added memory blocks. (I don't have any numbers without this patchset, but looking at the above example, it's at most half of the 448 memory blocks for virtio-mem, and most probably none for DIMMs). Again, there are workloads that might behave very differently due to the nature of ZONE_NORMAL. This change also affects (besides memory onlining): - Other users of undo_isolate_page_range(): Pages are always placed to the tail. -- When memory offlining fails -- When memory isolation fails after having isolated some pageblocks -- When alloc_contig_range() either succeeds or fails - Other users of __putback_isolated_page(): Pages are always placed to the tail. -- Free page reporting - Other users of __free_pages_core() -- AFAIKs, any memory that is getting exposed to the buddy during boot. IIUC we will now usually allocate memory from lower addresses within a zone first (especially during boot). - Other users of generic_online_page() -- Hyper-V balloon This patch (of 5): Let's prepare for additional flags and avoid long parameter lists of bools. Follow-up patches will also make use of the flags in __free_pages_ok(). Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Michal Hocko <mhocko@kernel.org> Link: https://lkml.kernel.org/r/20201005121534.15649-1-david@redhat.com Link: https://lkml.kernel.org/r/20201005121534.15649-2-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
90c7eae mm: don't panic when links can't be created in sysfs At boot time, or when doing memory hot-add operations, if the links in sysfs can't be created, the system is still able to run, so just report the error in the kernel log rather than BUG_ON and potentially make system unusable because the callpath can be called with locks held. Since the number of memory blocks managed could be high, the messages are rate limited. As a consequence, link_mem_sections() has no status to report anymore. Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: "Rafael J . Wysocki" <rafael@kernel.org> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200915094143.79181-4-ldufour@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
cb8e3c8 kernel/resource: make iomem_resource implicit in release_mem_region_adjustable() "mem" in the name already indicates the root, similar to release_mem_region() and devm_request_mem_region(). Make it implicit. The only single caller always passes iomem_resource, other parents are not applicable. Suggested-by: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kees Cook <keescook@chromium.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Baoquan He <bhe@redhat.com> Link: https://lkml.kernel.org/r/20200916073041.10355-1-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
2c76e7f hv_balloon: try to merge system ram resources Let's try to merge system ram resources we add, to minimize the number of resources in /proc/iomem. We don't care about the boundaries of individual chunks we added. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Wei Liu <wei.liu@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jason Wang <jasowang@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Julien Grall <julien@xen.org> Cc: Kees Cook <keescook@chromium.org> Cc: Len Brown <lenb@kernel.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Link: https://lkml.kernel.org/r/20200911103459.10306-9-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
1b989d5 xen/balloon: try to merge system ram resources Let's try to merge system ram resources we add, to minimize the number of resources in /proc/iomem. We don't care about the boundaries of individual chunks we added. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Julien Grall <julien@xen.org> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jason Wang <jasowang@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Len Brown <lenb@kernel.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Wei Liu <wei.liu@kernel.org> Link: https://lkml.kernel.org/r/20200911103459.10306-8-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
9b24247 virtio-mem: try to merge system ram resources virtio-mem adds memory in memory block granularity, to be able to remove it in the same granularity again later, and to grow slowly on demand. This, however, results in quite a lot of resources when adding a lot of memory. Resources are effectively stored in a list-based tree. Having a lot of resources not only wastes memory, it also makes traversing that tree more expensive, and makes /proc/iomem explode in size (e.g., requiring kexec-tools to manually merge resources later when e.g., trying to create a kdump header). Before this patch, we get (/proc/iomem) when hotplugging 2G via virtio-mem on x86-64: [...] 100000000-13fffffff : System RAM 140000000-33fffffff : virtio0 140000000-147ffffff : System RAM (virtio_mem) 148000000-14fffffff : System RAM (virtio_mem) 150000000-157ffffff : System RAM (virtio_mem) 158000000-15fffffff : System RAM (virtio_mem) 160000000-167ffffff : System RAM (virtio_mem) 168000000-16fffffff : System RAM (virtio_mem) 170000000-177ffffff : System RAM (virtio_mem) 178000000-17fffffff : System RAM (virtio_mem) 180000000-187ffffff : System RAM (virtio_mem) 188000000-18fffffff : System RAM (virtio_mem) 190000000-197ffffff : System RAM (virtio_mem) 198000000-19fffffff : System RAM (virtio_mem) 1a0000000-1a7ffffff : System RAM (virtio_mem) 1a8000000-1afffffff : System RAM (virtio_mem) 1b0000000-1b7ffffff : System RAM (virtio_mem) 1b8000000-1bfffffff : System RAM (virtio_mem) 3280000000-32ffffffff : PCI Bus 0000:00 With this patch, we get (/proc/iomem): [...] fffc0000-ffffffff : Reserved 100000000-13fffffff : System RAM 140000000-33fffffff : virtio0 140000000-1bfffffff : System RAM (virtio_mem) 3280000000-32ffffffff : PCI Bus 0000:00 Of course, with more hotplugged memory, it gets worse. When unplugging memory blocks again, try_remove_memory() (via offline_and_remove_memory()) will properly split the resource up again. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Baoquan He <bhe@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Juergen Gross <jgross@suse.com> Cc: Julien Grall <julien@xen.org> Cc: Kees Cook <keescook@chromium.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Len Brown <lenb@kernel.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Wei Liu <wei.liu@kernel.org> Link: https://lkml.kernel.org/r/20200911103459.10306-7-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
9ca6551 mm/memory_hotplug: MEMHP_MERGE_RESOURCE to specify merging of System RAM resources Some add_memory*() users add memory in small, contiguous memory blocks. Examples include virtio-mem, hyper-v balloon, and the XEN balloon. This can quickly result in a lot of memory resources, whereby the actual resource boundaries are not of interest (e.g., it might be relevant for DIMMs, exposed via /proc/iomem to user space). We really want to merge added resources in this scenario where possible. Let's provide a flag (MEMHP_MERGE_RESOURCE) to specify that a resource either created within add_memory*() or passed via add_memory_resource() shall be marked mergeable and merged with applicable siblings. To implement that, we need a kernel/resource interface to mark selected System RAM resources mergeable (IORESOURCE_SYSRAM_MERGEABLE) and trigger merging. Note: We really want to merge after the whole operation succeeded, not directly when adding a resource to the resource tree (it would break add_memory_resource() and require splitting resources again when the operation failed - e.g., due to -ENOMEM). Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kees Cook <keescook@chromium.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Julien Grall <julien@xen.org> Cc: Baoquan He <bhe@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Link: https://lkml.kernel.org/r/20200911103459.10306-6-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
b611719 mm/memory_hotplug: prepare passing flags to add_memory() and friends We soon want to pass flags, e.g., to mark added System RAM resources. mergeable. Prepare for that. This patch is based on a similar patch by Oscar Salvador: https://lkml.kernel.org/r/20190625075227.15193-3-osalvador@suse.de Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Juergen Gross <jgross@suse.com> # Xen related part Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Acked-by: Wei Liu <wei.liu@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Baoquan He <bhe@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: David Hildenbrand <david@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Julien Grall <julien@xen.org> Cc: Kees Cook <keescook@chromium.org> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Yang <richardw.yang@linux.intel.com> Link: https://lkml.kernel.org/r/20200911103459.10306-5-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
3a0aaef mm/memory_hotplug: guard more declarations by CONFIG_MEMORY_HOTPLUG We soon want to pass flags via a new type to add_memory() and friends. That revealed that we currently don't guard some declarations by CONFIG_MEMORY_HOTPLUG. While some definitions could be moved to different places, let's keep it minimal for now and use CONFIG_MEMORY_HOTPLUG for all functions only compiled with CONFIG_MEMORY_HOTPLUG. Wrap sparse_decode_mem_map() into CONFIG_MEMORY_HOTPLUG, it's only called from CONFIG_MEMORY_HOTPLUG code. While at it, remove allow_online_pfn_range(), which is no longer around, and mhp_notimplemented(), which is unused. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jason Wang <jasowang@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Julien Grall <julien@xen.org> Cc: Kees Cook <keescook@chromium.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Len Brown <lenb@kernel.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Wei Liu <wei.liu@kernel.org> Link: https://lkml.kernel.org/r/20200911103459.10306-4-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
7cf603d kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED IORESOURCE_MEM_DRIVER_MANAGED currently uses an unused PnP bit, which is always set to 0 by hardware. This is far from beautiful (and confusing), and the bit only applies to SYSRAM. So let's move it out of the bus-specific (PnP) defined bits. We'll add another SYSRAM specific bit soon. If we ever need more bits for other purposes, we can steal some from "desc", or reshuffle/regroup what we have. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kees Cook <keescook@chromium.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Julien Grall <julien@xen.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Len Brown <lenb@kernel.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Wei Liu <wei.liu@kernel.org> Link: https://lkml.kernel.org/r/20200911103459.10306-3-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:18 UTC
ec62d04 kernel/resource: make release_mem_region_adjustable() never fail Patch series "selective merging of system ram resources", v4. Some add_memory*() users add memory in small, contiguous memory blocks. Examples include virtio-mem, hyper-v balloon, and the XEN balloon. This can quickly result in a lot of memory resources, whereby the actual resource boundaries are not of interest (e.g., it might be relevant for DIMMs, exposed via /proc/iomem to user space). We really want to merge added resources in this scenario where possible. Resources are effectively stored in a list-based tree. Having a lot of resources not only wastes memory, it also makes traversing that tree more expensive, and makes /proc/iomem explode in size (e.g., requiring kexec-tools to manually merge resources when creating a kdump header. The current kexec-tools resource count limit does not allow for more than ~100GB of memory with a memory block size of 128MB on x86-64). Let's allow to selectively merge system ram resources by specifying a new flag for add_memory*(). Patch #5 contains a /proc/iomem example. Only tested with virtio-mem. This patch (of 8): Let's make sure splitting a resource on memory hotunplug will never fail. This will become more relevant once we merge selected System RAM resources - then, we'll trigger that case more often on memory hotunplug. In general, this function is already unlikely to fail. When we remove memory, we free up quite a lot of metadata (memmap, page tables, memory block device, etc.). The only reason it could really fail would be when injecting allocation errors. All other error cases inside release_mem_region_adjustable() seem to be sanity checks if the function would be abused in different context - let's add WARN_ON_ONCE() in these cases so we can catch them. [natechancellor@gmail.com: fix use of ternary condition in release_mem_region_adjustable] Link: https://lkml.kernel.org/r/20200922060748.2452056-1-natechancellor@gmail.com Link: https://github.com/ClangBuiltLinux/linux/issues/1159 Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Kees Cook <keescook@chromium.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Baoquan He <bhe@redhat.com> Cc: Wei Yang <richardw.yang@linux.intel.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Julien Grall <julien@xen.org> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Len Brown <lenb@kernel.org> Cc: Leonardo Bras <leobras.c@gmail.com> Cc: Libor Pechacek <lpechacek@suse.cz> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: "Oliver O'Halloran" <oohall@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Pingfan Liu <kernelfans@gmail.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Roger Pau Monn <roger.pau@citrix.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Wei Liu <wei.liu@kernel.org> Link: https://lkml.kernel.org/r/20200911103459.10306-2-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:17 UTC
b30c592 mm/memory_hotplug: mark pageblocks MIGRATE_ISOLATE while onlining memory Currently, it can happen that pages are allocated (and freed) via the buddy before we finished basic memory onlining. For example, pages are exposed to the buddy and can be allocated before we actually mark the sections online. Allocated pages could suddenly fail pfn_to_online_page() checks. We had similar issues with pcp handling, when pages are allocated+freed before we reach zone_pcp_update() in online_pages() [1]. Instead, mark all pageblocks MIGRATE_ISOLATE, such that allocations are impossible. Once done with the heavy lifting, use undo_isolate_page_range() to move the pages to the MIGRATE_MOVABLE freelist, marking them ready for allocation. Similar to offline_pages(), we have to manually adjust zone->nr_isolate_pageblock. [1] https://lkml.kernel.org/r/1597150703-19003-1-git-send-email-charante@codeaurora.org Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Charan Teja Reddy <charante@codeaurora.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michel Lespinasse <walken@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200819175957.28465-11-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:17 UTC
d882c00 mm: pass migratetype into memmap_init_zone() and move_pfn_range_to_zone() On the memory onlining path, we want to start with MIGRATE_ISOLATE, to un-isolate the pages after memory onlining is complete. Let's allow passing in the migratetype. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michel Lespinasse <walken@google.com> Cc: Charan Teja Reddy <charante@codeaurora.org> Cc: Mel Gorman <mgorman@techsingularity.net> Link: https://lkml.kernel.org/r/20200819175957.28465-10-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:17 UTC
4eb29bd mm/page_alloc: drop stale pageblock comment in memmap_init_zone*() Commit ac5d2539b238 ("mm: meminit: reduce number of times pageblocks are set during struct page init") moved the actual zone range check, leaving only the alignment check for pageblocks. Let's drop the stale comment and make the pageblock check easier to read. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Charan Teja Reddy <charante@codeaurora.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michel Lespinasse <walken@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200819175957.28465-9-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:17 UTC
aac6532 mm/memory_hotplug: simplify page onlining We don't allow to offline memory with holes, all boot memory is online, and all hotplugged memory cannot have holes. We can now simplify onlining of pages. As we only allow to online/offline full sections and sections always span full MAX_ORDER_NR_PAGES, we can just process MAX_ORDER - 1 pages without further special handling. The number of onlined pages simply corresponds to the number of pages we were requested to online. While at it, refine the comment regarding the callback not exposing all pages to the buddy. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Charan Teja Reddy <charante@codeaurora.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michel Lespinasse <walken@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200819175957.28465-8-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:17 UTC
3fa0c7c mm/page_isolation: simplify return value of start_isolate_page_range() Callers no longer need the number of isolated pageblocks. Let's simplify. Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Charan Teja Reddy <charante@codeaurora.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Logan Gunthorpe <logang@deltatee.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michel Lespinasse <walken@google.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200819175957.28465-7-david@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 16 October 2020, 18:11:17 UTC
back to top