https://github.com/torvalds/linux

Revision Author Date Message Commit Date
3ceccb1 rbd: don't assume rbd_is_lock_owner() for exclusive mappings Expanding on the previous commit, assuming that rbd_is_lock_owner() always returns true (i.e. that we are either in RBD_LOCK_STATE_LOCKED or RBD_LOCK_STATE_QUIESCING) if the mapping is exclusive is wrong too. In case ceph_cls_set_cookie() fails, the lock would be temporarily released even if the mapping is exclusive, meaning that we can end up even in RBD_LOCK_STATE_UNLOCKED. IOW, exclusive mappings are really "just" about disabling automatic lock transitions (as documented in the man page), not about grabbing the lock and holding on to it whatever it takes. Cc: stable@vger.kernel.org Fixes: 637cd060537d ("rbd: new exclusive lock wait/wake code") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn> 25 July 2024, 10:18:29 UTC
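The point of the message above can be sketched in a few lines of userspace C. This is an illustration only, not the rbd driver code: the enum and both function names are hypothetical stand-ins, and the only facts taken from the message are that lock ownership means being in the locked or quiescing state and that an exclusive mapping can still end up unlocked.

```c
#include <stdbool.h>

/* Simplified stand-ins for the lock states named in the commit message. */
enum lock_state {
	LOCK_STATE_UNLOCKED,
	LOCK_STATE_LOCKED,
	LOCK_STATE_QUIESCING,
};

/* Ownership as the message describes it: locked or quiescing. */
static bool is_lock_owner(enum lock_state state)
{
	return state == LOCK_STATE_LOCKED || state == LOCK_STATE_QUIESCING;
}

/*
 * Hypothetical helper: whether a mapping currently owns the lock.
 * The exclusive flag is deliberately not used as a shortcut, because an
 * exclusive mapping only disables automatic lock transitions.
 */
static bool mapping_owns_lock(bool exclusive_mapping, enum lock_state state)
{
	(void)exclusive_mapping;
	return is_lock_owner(state);
}
```

With this shape, an exclusive mapping in the unlocked state is reported as not owning the lock, which is the case the commit says must not be assumed away.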
2237ceb rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings Every time a watch is reestablished after getting lost, we need to update the cookie which involves quiescing exclusive lock. For this, we transition from RBD_LOCK_STATE_LOCKED to RBD_LOCK_STATE_QUIESCING roughly for the duration of rbd_reacquire_lock() call. If the mapping is exclusive and I/O happens to arrive in this time window, it's failed with EROFS (later translated to EIO) based on the wrong assumption in rbd_img_exclusive_lock() -- "lock got released?" check there stopped making sense with commit a2b1da09793d ("rbd: lock should be quiesced on reacquire"). To make it worse, any such I/O is added to the acquiring list before EROFS is returned and this sets up for violating rbd_lock_del_request() precondition that the request is either on the running list or not on any list at all -- see commit ded080c86b3f ("rbd: don't move requests to the running list on errors"). rbd_lock_del_request() ends up processing these requests as if they were on the running list which screws up quiescing_wait completion counter and ultimately leads to rbd_assert(!completion_done(&rbd_dev->quiescing_wait)); being triggered on the next watch error. Cc: stable@vger.kernel.org # 06ef84c4e9c4: rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait Cc: stable@vger.kernel.org Fixes: 637cd060537d ("rbd: new exclusive lock wait/wake code") Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn> 25 July 2024, 10:18:28 UTC
f5c466a rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait ... to RBD_LOCK_STATE_QUIESCING and quiescing_wait to recognize that this state and the associated completion are backing rbd_quiesce_lock(), which isn't specific to releasing the lock. While exclusive lock does get quiesced before it's released, it also gets quiesced before an attempt to update the cookie is made and there the lock is not released as long as ceph_cls_set_cookie() succeeds. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn> 25 July 2024, 10:18:01 UTC
03230ed ceph: fix incorrect kmalloc size of pagevec mempool The kmalloc size of pagevec mempool is incorrectly calculated. It misses the size of page pointer and only accounts the number for the array. Fixes: a0102bda5bc0 ("ceph: move sb->wb_pagevec_pool to be a global mempool") Signed-off-by: ethanwu <ethanwu@synology.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> 23 July 2024, 08:01:57 UTC
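The sizing mistake described above is a common one and can be shown in isolation. The following is a minimal userspace sketch, not the ceph code: both function names are hypothetical, and struct page is only forward-declared so that a pointer to it has a size.

```c
#include <stddef.h>

/* Forward declaration only; the sketch needs just the pointer size. */
struct page;

/* Buggy sizing: counts the array slots but not the size of each slot. */
static size_t pagevec_bytes_buggy(size_t nr_pages)
{
	return nr_pages;
}

/* Fixed sizing: one page pointer per slot. */
static size_t pagevec_bytes_fixed(size_t nr_pages)
{
	return nr_pages * sizeof(struct page *);
}
```

On a 64-bit build the buggy version under-allocates by a factor of eight, so writes to the later slots would run past the end of the buffer.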
578eb54 ceph: periodically flush the cap releases The MDS could be waiting the caps releases infinitely in some corner case and then reporting the caps revoke stuck warning. To fix this we should periodically flush the cap releases. Link: https://tracker.ceph.com/issues/57244 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> 23 July 2024, 08:01:57 UTC
77bb4a5 ceph: convert comma to semicolon in __ceph_dentry_dir_lease_touch() Replace a comma between expression statements by a semicolon. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> 23 July 2024, 08:01:57 UTC
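For readers unfamiliar with this class of cleanup, here is a small sketch of why the change is cosmetic. The struct and field names are hypothetical and are not taken from __ceph_dentry_dir_lease_touch().

```c
/* Hypothetical structure used only for this illustration. */
struct lease {
	int touched;
	int referenced;
};

/* Comma operator: both assignments form a single expression statement. */
static void touch_with_comma(struct lease *l)
{
	l->touched = 1,
	l->referenced = 1;
}

/* Semicolon: two separate statements with the same effect. */
static void touch_with_semicolon(struct lease *l)
{
	l->touched = 1;
	l->referenced = 1;
}
```

Both functions leave the structure in the same state. The semicolon form is preferred because a stray comma can silently change meaning when the line is later edited, for example under an unbraced if.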
65d284a ceph: use cap_wait_list only if debugfs is enabled Only debugfs uses this list. By omitting it, we save some memory and reduce lock contention on `caps_list_lock`. Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Reviewed-by: Xiubo Li <xiubli@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> 23 July 2024, 08:01:57 UTC
0c38364 Linux 6.10 14 July 2024, 22:43:32 UTC
882ddcd Merge tag 'kbuild-fixes-v6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Make scripts/ld-version.sh robust against the latest LLD - Fix warnings in rpm-pkg with device tree support - Fix warnings in fortify tests with KASAN * tag 'kbuild-fixes-v6.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: fortify: fix warnings in fortify tests with KASAN kbuild: rpm-pkg: avoid the warnings with dtb's listed twice kbuild: Make ld-version.sh more robust against version string changes 14 July 2024, 22:29:35 UTC
84679f0 fortify: fix warnings in fortify tests with KASAN When a software KASAN mode is enabled, the fortify tests emit warnings on some architectures. For example, for ARCH=arm, the combination of CONFIG_FORTIFY_SOURCE=y and CONFIG_KASAN=y produces the following warnings: TEST lib/test_fortify/read_overflow-memchr.log warning: unsafe memchr() usage lacked '__read_overflow' warning in lib/test_fortify/read_overflow-memchr.c TEST lib/test_fortify/read_overflow-memchr_inv.log warning: unsafe memchr_inv() usage lacked '__read_overflow' symbol in lib/test_fortify/read_overflow-memchr_inv.c TEST lib/test_fortify/read_overflow-memcmp.log warning: unsafe memcmp() usage lacked '__read_overflow' warning in lib/test_fortify/read_overflow-memcmp.c TEST lib/test_fortify/read_overflow-memscan.log warning: unsafe memscan() usage lacked '__read_overflow' symbol in lib/test_fortify/read_overflow-memscan.c TEST lib/test_fortify/read_overflow2-memcmp.log warning: unsafe memcmp() usage lacked '__read_overflow2' warning in lib/test_fortify/read_overflow2-memcmp.c [ more and more similar warnings... ] Commit 9c2d1328f88a ("kbuild: provide reasonable defaults for tool coverage") removed KASAN flags from non-kernel objects by default. It was an intended behavior because lib/test_fortify/*.c are unit tests that are not linked to the kernel. As it turns out, some architectures require -fsanitize=kernel-(hw)address to define __SANITIZE_ADDRESS__ for the fortify tests. Without __SANITIZE_ADDRESS__ defined, arch/arm/include/asm/string.h defines __NO_FORTIFY, thus excluding <linux/fortify-string.h>. This issue does not occur on x86 thanks to commit 4ec4190be4cf ("kasan, x86: don't rename memintrinsics in uninstrumented files"), but there are still some architectures that define __NO_FORTIFY in such a situation. Set KASAN_SANITIZE=y explicitly to the fortify tests. Fixes: 9c2d1328f88a ("kbuild: provide reasonable defaults for tool coverage") Reported-by: Arnd Bergmann <arnd@arndb.de> Closes: https://lore.kernel.org/all/0e8dee26-41cc-41ae-9493-10cd1a8e3268@app.fastmail.com/ Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> 14 July 2024, 19:53:49 UTC
e328643 kbuild: rpm-pkg: avoid the warnings with dtb's listed twice After 8d1001f7bdd0 (kbuild: rpm-pkg: fix build error with CONFIG_MODULES=n), the following warning "warning: File listed twice: *.dtb" is appearing for every dtb file that is included. The reason is that the commented commit already adds the folder /lib/modules/%{KERNELRELEASE} in kernel.list file so the folder /lib/modules/%{KERNELRELEASE}/dtb is no longer necessary, just remove it. Fixes: 8d1001f7bdd0 ("kbuild: rpm-pkg: fix build error with CONFIG_MODULES=n") Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> 14 July 2024, 18:13:32 UTC
9852f47 kbuild: Make ld-version.sh more robust against version string changes After [1] in upstream LLVM, ld.lld's version output became slightly different when the cmake configuration option LLVM_APPEND_VC_REV is disabled. Before: Debian LLD 19.0.0 (compatible with GNU linkers) After: Debian LLD 19.0.0, compatible with GNU linkers This results in ld-version.sh failing with scripts/ld-version.sh: 18: arithmetic expression: expecting EOF: "10000 * 19 + 100 * 0 + 0," because the trailing comma is included in the patch level part of the expression. While [1] has been partially reverted in [2] to avoid this breakage (as it impacts the configuration stage and it is present in all LTS branches), it would be good to make ld-version.sh more robust against such miniscule changes like this one. Use POSIX shell parameter expansion [3] to remove the largest suffix after just numbers and periods, replacing of the current removal of everything after a hyphen. ld-version.sh continues to work for a number of distributions (Arch Linux, Debian, and Fedora) and the kernel.org toolchains and no longer errors on a version of ld.lld with [1]. Fixes: 02aff8592204 ("kbuild: check the minimum linker version in Kconfig") Link: https://github.com/llvm/llvm-project/commit/0f9fbbb63cfcd2069441aa2ebef622c9716f8dbb [1] Link: https://github.com/llvm/llvm-project/commit/649cdfc4b6781a350dfc87d9b2a4b5a4c3395909 [2] Link: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html [3] Suggested-by: Fangrui Song <maskray@google.com> Reviewed-by: Fangrui Song <maskray@google.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> 14 July 2024, 18:13:32 UTC
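The actual fix lives in a shell script and uses POSIX parameter expansion. As a rough C analogue of the same idea (keep only the leading run of digits and periods, then evaluate 10000 * major + 100 * minor + patch as in the error message above), one might write the sketch below. The function name and the -1 error convention are assumptions made for this illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: turn a version token such as "19.0.0," into a
 * single integer. Everything after the leading digits and periods is
 * dropped, so a trailing comma or "-1" suffix no longer breaks parsing.
 */
static int canonical_version(const char *token)
{
	char buf[32];
	size_t len = strspn(token, "0123456789.");
	int major = 0, minor = 0, patch = 0;

	if (len == 0 || len >= sizeof(buf))
		return -1;

	memcpy(buf, token, len);
	buf[len] = '\0';

	/* At least the major number must parse; missing parts stay 0. */
	if (sscanf(buf, "%d.%d.%d", &major, &minor, &patch) < 1)
		return -1;

	return 10000 * major + 100 * minor + patch;
}
```

Called with "19.0.0," (the new LLD output including the comma) and with "19.0.0", the helper yields the same value, which is the robustness property the commit is after.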
3653469 Merge tag 'sched_urgent_for_v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Fix a performance regression when measuring the CPU time of a thread (clock_gettime(CLOCK_THREAD_CPUTIME_ID,...)) due to the addition of PSI IRQ time accounting in the hotpath - Fix a task_struct leak due to missing to decrement the refcount when the task is enqueued before the timer which is supposed to do that, expires - Revert an attempt to expedite detaching of movable tasks, as finding those could become very costly. Turns out the original issue wasn't even hit by anyone * tag 'sched_urgent_for_v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath sched/deadline: Fix task_struct reference leak Revert "sched/fair: Make sure to try to detach at least one movable task" 14 July 2024, 17:18:25 UTC
35ce463 Merge tag 'x86_urgent_for_v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Borislav Petkov: - Make sure TF is cleared before calling other functions (BHI mitigation in this case) in the SYSENTER compat handler, as otherwise it will warn about being in single-step mode * tag 'x86_urgent_for_v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/bhi: Avoid warning in #DB handler due to BHI mitigation 14 July 2024, 17:11:20 UTC
4d145e3 Merge tag 'i2c-for-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Fixes for the I2C testunit, the Renesas R-Car driver and some MAINTAINERS corrections" * tag 'i2c-for-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: testunit: avoid re-issued work after read message i2c: rcar: ensure Gen3+ reset does not disturb local targets i2c: mark HostNotify target address as used i2c: testunit: correct Kconfig description MAINTAINERS: VIRTIO I2C loses a maintainer, gains a reviewer MAINTAINERS: delete entries for Thor Thayer i2c: rcar: clear NO_RXDMA flag after resetting i2c: rcar: bring hardware to known state when probing 13 July 2024, 23:34:22 UTC
d0d0cd3 Merge tag '6.10-rc7-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fix from Steve French: "Small fix, also for stable" * tag '6.10-rc7-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6: cifs: fix setting SecurityFlags to true 13 July 2024, 20:00:25 UTC
d2346e2 cifs: fix setting SecurityFlags to true If you try to set /proc/fs/cifs/SecurityFlags to 1 it will set them to CIFSSEC_MUST_NTLMV2 which no longer is relevant (the less secure ones like lanman have been removed from cifs.ko) and is also missing some flags (like for signing and encryption) and can even cause mount to fail, so change this to set it to Kerberos in this case. Also change the description of the SecurityFlags to remove mention of flags which are no longer supported. Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com> 13 July 2024, 14:24:27 UTC
3fdd2d2 Merge tag 'i2c-host-fixes-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-current This tag includes three fixes for the Renesas R-Car driver: 1. Ensures the device is in a known state after probing. 2. Allows clearing the NO_RXDMA flag after a reset. 3. Forces a reset before any transfer on Gen3+ platforms to prevent disruption of the configuration during parallel transfers. 13 July 2024, 08:50:55 UTC
528dd46 Merge tag 'net-6.10-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull more networking fixes from Jakub Kicinski: "A quick follow up to yesterday's pull. We got a regressions report for the bnxt patch as soon as it got to your tree. The ethtool fix is also good to have, although it's an older regression. Current release - regressions: - eth: bnxt_en: fix crash in bnxt_get_max_rss_ctx_ring() on older HW when user tries to decrease the ring count Previous releases - regressions: - ethtool: fix RSS setting, accept "no change" setting if the driver doesn't support the new features - eth: i40e: remove needless retries of NVM update, don't wait 20min when we know the firmware update won't succeed" * tag 'net-6.10-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: bnxt_en: Fix crash in bnxt_get_max_rss_ctx_ring() octeontx2-af: fix issue with IPv4 match for RSS octeontx2-af: fix issue with IPv6 ext match for RSS octeontx2-af: fix detection of IP layer octeontx2-af: fix a issue with cpt_lf_alloc mailbox octeontx2-af: replace cpt slot with lf id on reg write i40e: fix: remove needless retries of NVM update net: ethtool: Fix RSS setting 13 July 2024, 01:33:33 UTC
f7ce5eb bnxt_en: Fix crash in bnxt_get_max_rss_ctx_ring() On older chips not supporting multiple RSS contexts, reducing ethtool channels will crash: BUG: kernel NULL pointer dereference, address: 00000000000000b8 PGD 0 P4D 0 Oops: Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 7032 Comm: ethtool Tainted: G S 6.10.0-rc4 #1 Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.4.3 01/17/2017 RIP: 0010:bnxt_get_max_rss_ctx_ring+0x4c/0x90 [bnxt_en] Code: c3 d3 eb 4c 8b 83 38 01 00 00 48 8d bb 38 01 00 00 4c 39 c7 74 42 41 8d 54 24 ff 31 c0 0f b7 d2 4c 8d 4c 12 02 66 85 ed 74 1d <49> 8b 90 b8 00 00 00 49 8d 34 11 0f b7 0a 66 39 c8 0f 42 c1 48 83 RSP: 0018:ffffaaa501d23ba8 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff8efdf600c940 RCX: 0000000000000000 RDX: 000000000000007f RSI: ffffffffacf429c4 RDI: ffff8efdf600ca78 RBP: 0000000000000080 R08: 0000000000000000 R09: 0000000000000100 R10: 0000000000000001 R11: ffffaaa501d238c0 R12: 0000000000000080 R13: 0000000000000000 R14: ffff8efdf600c000 R15: 0000000000000006 FS: 00007f977a7d2740(0000) GS:ffff8f041f840000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000b8 CR3: 00000002320aa004 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __die_body+0x15/0x60 ? page_fault_oops+0x157/0x440 ? do_user_addr_fault+0x60/0x770 ? _raw_spin_lock_irqsave+0x12/0x40 ? exc_page_fault+0x61/0x120 ? asm_exc_page_fault+0x22/0x30 ? bnxt_get_max_rss_ctx_ring+0x4c/0x90 [bnxt_en] ? bnxt_get_max_rss_ctx_ring+0x25/0x90 [bnxt_en] bnxt_set_channels+0x9d/0x340 [bnxt_en] ethtool_set_channels+0x14b/0x210 __dev_ethtool+0xdf8/0x2890 ? preempt_count_add+0x6a/0xa0 ? percpu_counter_add_batch+0x23/0x90 ? filemap_map_pages+0x417/0x4a0 ? avc_has_extended_perms+0x185/0x420 ? __pfx_udp_ioctl+0x10/0x10 ? sk_ioctl+0x55/0xf0 ? kmalloc_trace_noprof+0xe0/0x210 ? dev_ethtool+0x54/0x170 dev_ethtool+0xa2/0x170 dev_ioctl+0xbe/0x530 sock_do_ioctl+0xa3/0xf0 sock_ioctl+0x20d/0x2e0 bp->rss_ctx_list is not initialized if the chip or firmware does not support multiple RSS contexts. Fix it by adding a check in bnxt_get_max_rss_ctx_ring() before proceeding to reference bp->rss_ctx_list. Fixes: 0d1b7d6c9274 ("bnxt: fix crashes when reducing ring count with active RSS contexts") Reported-by: Breno Leitao <leitao@debian.org> Link: https://lore.kernel.org/netdev/ZpFEJeNpwxW1aW9k@gmail.com/ Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20240712175318.166811-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 13 July 2024, 01:00:00 UTC
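The shape of the fix described above (check for feature support before touching a list that is only initialized when the feature exists) can be sketched as follows. This is not the bnxt_en code: the structures, the singly linked list and every name here are hypothetical simplifications of what the message describes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical RSS context entry; the real driver uses kernel list heads. */
struct rss_ctx {
	unsigned short max_ring;
	struct rss_ctx *next;
};

/* Hypothetical device state. */
struct nic {
	bool multi_rss_supported;
	struct rss_ctx *rss_ctx_list; /* only valid when multi_rss_supported */
};

/* Return the highest ring used by any RSS context, or 0 if there is none. */
static unsigned short max_rss_ctx_ring(const struct nic *nic)
{
	const struct rss_ctx *ctx;
	unsigned short max_ring = 0;

	/* The guard: never walk a list that was never initialized. */
	if (!nic->multi_rss_supported)
		return 0;

	for (ctx = nic->rss_ctx_list; ctx; ctx = ctx->next)
		if (ctx->max_ring > max_ring)
			max_ring = ctx->max_ring;

	return max_ring;
}
```

Without the early return, the loop would dereference whatever happens to be in the list field on older hardware, which matches the NULL pointer dereference in the trace above.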
975f3b6 Merge tag 'for-6.10-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Fix a regression in extent map shrinker behaviour. In the past weeks we got reports from users that there are huge latency spikes or freezes. This was bisected to newly added shrinker of extent maps (it was added to fix a build up of the structures in memory). I'm assuming that the freezes would happen to many users after release so I'd like to get it merged now so it's in 6.10. Although the diff size is not small the changes are relatively straightforward, the reporters verified the fixes and we did testing on our side. The fixes: - adjust behaviour under memory pressure and check lock or scheduling conditions, bail out if needed - synchronize tracking of the scanning progress so inode ranges are not skipped or work duplicated - do a delayed iput when scanning a root so evicting an inode does not slow things down in case of lots of dirty data, also fix lockdep warning, a deadlock could happen when writing the dirty data would need to start a transaction" * tag 'for-6.10-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: avoid races when tracking progress for extent map shrinking btrfs: stop extent map shrinker if reschedule is needed btrfs: use delayed iput during extent map shrinking 12 July 2024, 19:08:42 UTC
a52ff90 Merge tag 'ceph-for-6.10-rc8' of https://github.com/ceph/ceph-client Pull ceph fixes from Ilya Dryomov: "A fix for a possible use-after-free following "rbd unmap" or "umount" marked for stable and two kernel-doc fixups" * tag 'ceph-for-6.10-rc8' of https://github.com/ceph/ceph-client: libceph: fix crush_choose_firstn() kernel-doc warnings libceph: suppress crush_choose_indep() kernel-doc warnings libceph: fix race between delayed_work() and ceph_monc_stop() 12 July 2024, 17:39:29 UTC
ac6a9e0 Merge tag 'pmdomain-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain fix from Ulf Hansson: - qcom: Skip retention level for rpmhpd's * tag 'pmdomain-v6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: pmdomain: qcom: rpmhpd: Skip retention level for Power Domains 12 July 2024, 17:29:49 UTC
01ec3bb Merge tag 'mmc-v6.10-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC host fixes from Ulf Hansson: - davinci_mmc: Prevent transmitted data size from exceeding sgm's length - sdhci: Fix max_seg_size for 64KiB PAGE_SIZE * tag 'mmc-v6.10-rc4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: davinci_mmc: Prevent transmitted data size from exceeding sgm's length mmc: sdhci: Fix max_seg_size for 64KiB PAGE_SIZE 12 July 2024, 17:26:48 UTC
e091caf Merge tag 'arm-fixes-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "Most of these changes are Qualcomm SoC specific and came in just after I sent out the last set of fixes. This includes two regression fixes for SoC drivers, a defconfig change to ensure the Lenovo X13s is usable and 11 changes to DT files to fix regressions and minor platform specific issues. Tony and Chunyan step back from their respective maintainership roles on the omap and unisoc platforms, and Christophe in turn takes over maintaining some of the Freescale SoC drivers that he has been taking care of in practice already. Lastly, there are two trivial fixes for the davinci and sunxi platforms" * tag 'arm-fixes-6.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: MAINTAINERS: Update FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY MAINTAINERS: Add more maintainers for omaps ARM: davinci: Convert comma to semicolon MAINTAINERS: Move myself from SPRD Maintainer to Reviewer Revert "dt-bindings: cache: qcom,llcc: correct QDU1000 reg entries" arm64: dts: qcom: qdu1000: Fix LLCC reg property arm64: dts: qcom: sm6115: add iommu for sdhc_1 arm64: dts: qcom: x1e80100-crd: fix DAI used for headset recording arm64: dts: qcom: x1e80100-crd: fix WCD audio codec TX port mapping soc: qcom: pmic_glink: disable UCSI on sc8280xp arm64: defconfig: enable Elan i2c-hid driver arm64: dts: qcom: sc8280xp-crd: use external pull up for touch reset arm64: dts: qcom: sc8280xp-x13s: fix touchscreen power on arm64: dts: qcom: x1e80100: Fix PCIe 6a reg offsets and add MHI arm64: dts: qcom: sa8775p: Correct IRQ number of EL2 non-secure physical timer arm64: dts: allwinner: Fix PMIC interrupt number arm64: dts: qcom: sc8280xp: Set status = "reserved" on PSHOLD arm64: dts: qcom: x1e80100-*: Allocate some CMA buffers arm64: dts: qcom: sc8180x: Fix LLCC reg property again 12 July 2024, 16:00:25 UTC
f469cf9 Merge tag 'char-misc-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc driver fixes from Greg KH: "Here are some small remaining driver fixes for 6.10-final that have all been in linux-next for a while and resolve reported issues. Included in here are: - mei driver fixes (and a spelling fix at the end just to be clean) - iio driver fixes for reported problems - fastrpc bugfixes - nvmem small fixes" * tag 'char-misc-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: mei: vsc: Fix spelling error mei: vsc: Enhance SPI transfer of IVSC ROM mei: vsc: Utilize the appropriate byte order swap function mei: vsc: Prevent timeout error with added delay post-firmware download mei: vsc: Enhance IVSC chipset stability during warm reboot nvmem: core: limit cell sysfs permissions to main attribute ones nvmem: core: only change name to fram for current attribute nvmem: meson-efuse: Fix return value of nvmem callbacks nvmem: rmem: Fix return value of rmem_read() misc: microchip: pci1xxxx: Fix return value of nvmem callbacks hpet: Support 32-bit userspace misc: fastrpc: Restrict untrusted app to attach to privileged PD misc: fastrpc: Fix ownership reassignment of remote heap misc: fastrpc: Fix memory leak in audio daemon attach operation misc: fastrpc: Avoid updating PD type for capability request misc: fastrpc: Copy the complete capability structure to user misc: fastrpc: Fix DSP capabilities request iio: light: apds9306: Fix error handing iio: trigger: Fix condition for own trigger 12 July 2024, 15:45:27 UTC
1cb67bc Merge tag 'tty-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty / serial fixes from Greg KH: "Here are some small serial driver fixes for 6.10-final. Included in here are: - qcom-geni fixes for a much much much discussed issue and everyone now seems to be agreed that this is the proper way forward to resolve the reported lockups - imx serial driver bugfixes - 8250_omap errata fix - ma35d1 serial driver bugfix All of these have been in linux-next for over a week with no reported issues" * tag 'tty-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: qcom-geni: do not kill the machine on fifo underrun serial: qcom-geni: fix hard lockup on buffer flush serial: qcom-geni: fix soft lockup on sw flow control and suspend serial: imx: ensure RTS signal is not left active after shutdown tty: serial: ma35d1: Add a NULL check for of_node serial: 8250_omap: Fix Errata i2310 with RX FIFO level check serial: imx: only set receiver level if it is zero 12 July 2024, 15:39:44 UTC
1293147 Merge tag 'usb-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB driver fixes and new device ids for 6.10-final. Included in here are: - new usb-serial device ids for reported devices - syzbot-triggered duplicate endpoint bugfix - gadget bugfix for configfs memory overwrite - xhci resume bugfix - new device quirk added - usb core error path bugfix All of these have been in linux-next (most for a while) with no reported issues" * tag 'usb-6.10-final' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: serial: mos7840: fix crash on resume USB: serial: option: add Rolling RW350-GL variants USB: serial: option: add support for Foxconn T99W651 USB: serial: option: add Netprisma LCUK54 series modules usb: gadget: configfs: Prevent OOB read/write in usb_string_copy() usb: dwc3: pci: add support for the Intel Panther Lake usb: core: add missing of_node_put() in usb_of_has_devices_or_graph USB: Add USB_QUIRK_NO_SET_INTF quirk for START BP-850k USB: core: Fix duplicate endpoint bug by clearing reserved bits in the descriptor xhci: always resume roothubs if xHC was reset during resume USB: serial: option: add Telit generic core-dump composition USB: serial: option: add Fibocom FM350-GL USB: serial: option: add Telit FN912 rmnet compositions 12 July 2024, 15:35:56 UTC
9b48104 Merge tag 'sound-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "The majority of changes here are small device-specific fixes for ASoC SOF / Intel and usual HD-audio quirks. The only significant high LOC is found in the Cirrus firmware driver, but all those are for hardening against malicious firmware blobs, and they look fine for taking as a last minute fix, too" * tag 'sound-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek: Enable Mute LED on HP 250 G7 firmware: cs_dsp: Use strnlen() on name fields in V1 wmfw files ALSA: hda/realtek: Limit mic boost on VAIO PRO PX ALSA: hda: cs35l41: Fix swapped l/r audio channels for Lenovo ThinBook 13x Gen4 ASoC: SOF: Intel: hda-pcm: Limit the maximum number of periods by MAX_BDL_ENTRIES ASoC: rt711-sdw: add missing readable registers ASoC: SOF: Intel: hda: fix null deref on system suspend entry ALSA: hda/realtek: add quirk for Clevo V5[46]0TU firmware: cs_dsp: Prevent buffer overrun when processing V2 alg headers firmware: cs_dsp: Validate payload length before processing block firmware: cs_dsp: Return error if block header overflows file firmware: cs_dsp: Fix overflow checking of wmfw header 12 July 2024, 15:32:40 UTC
5d4c851 Merge tag 'bcachefs-2024-07-12' of https://evilpiepirate.org/git/bcachefs Pull more bcachefs fixes from Kent Overstreet: - revert the SLAB_ACCOUNT patch, something crazy is going on in memcg and someone forgot to test - minor fixes: missing rcu_read_lock(), scheduling while atomic (in an emergency shutdown path) - two lockdep fixes; these could have gone earlier, but were left to bake awhile * tag 'bcachefs-2024-07-12' of https://evilpiepirate.org/git/bcachefs: bcachefs: bch2_gc_btree() should not use btree_root_lock bcachefs: Set PF_MEMALLOC_NOFS when trans->locked bcachefs; Use trans_unlock_long() when waiting on allocator Revert "bcachefs: Mark bch_inode_info as SLAB_ACCOUNT" bcachefs: fix scheduling while atomic in break_cycle() bcachefs: Fix RCU splat 12 July 2024, 15:22:43 UTC
425652d Merge branch 'octeontx2-cpt-rss-cfg-fixes' into main Srujana Challa says: ==================== Fixes for CPT and RSS configuration This series of patches fixes various issues related to CPT configuration and RSS configuration. v1->v2: - Excluded the patch "octeontx2-af: reduce cpt flt interrupt vectors for cn10kb" to submit it to net-next. - Addressed the review comments. Kiran Kumar K (1): octeontx2-af: Fix issue with IPv6 ext match for RSS ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 12 July 2024, 12:42:02 UTC
60795bb octeontx2-af: fix issue with IPv4 match for RSS While performing RSS based on IPv4, packets with IPv4 options are not being considered. Adding changes to match both plain IPv4 and IPv4 with option header. Fixes: 41a7aa7b800d ("octeontx2-af: NIX Rx flowkey configuration for RSS") Signed-off-by: Satheesh Paul <psatheesh@marvell.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> 12 July 2024, 12:42:01 UTC
e23ac10 octeontx2-af: fix issue with IPv6 ext match for RSS While performing RSS based on IPv6, extension ltype is not being considered. This will be problem for fragmented packets or packets with extension header. Adding changes to match IPv6 ext header along with IPv6 ltype. Fixes: 41a7aa7b800d ("octeontx2-af: NIX Rx flowkey configuration for RSS") Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> 12 July 2024, 12:42:01 UTC
404dc0f octeontx2-af: fix detection of IP layer Checksum and length checks are not enabled for IPv4 header with options and IPv6 with extension headers. To fix this a change in enum npc_kpu_lc_ltype is required which will allow adjustment of LTYPE_MASK to detect all types of IP headers. Fixes: 21e6699e5cd6 ("octeontx2-af: Add NPC KPU profile") Signed-off-by: Michal Mazur <mmazur2@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> 12 July 2024, 12:42:00 UTC
845fe19 octeontx2-af: fix a issue with cpt_lf_alloc mailbox This patch fixes CPT_LF_ALLOC mailbox error due to incompatible mailbox message format. Specifically, it corrects the `blkaddr` field type from `int` to `u8`. Fixes: de2854c87c64 ("octeontx2-af: Mailbox changes for 98xx CPT block") Signed-off-by: Srujana Challa <schalla@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> 12 July 2024, 12:42:00 UTC
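Why a field type change from int to u8 makes two sides of a mailbox disagree can be seen from structure layout alone. The two structs below are hypothetical and much smaller than the real CPT_LF_ALLOC message; only the blkaddr field name and its two types come from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout as seen by a side that declares blkaddr as int. */
struct lf_alloc_msg_int {
	uint16_t hdr;
	int blkaddr;
	uint8_t flag;
};

/* Hypothetical layout as seen by a side that declares blkaddr as u8. */
struct lf_alloc_msg_u8 {
	uint16_t hdr;
	uint8_t blkaddr;
	uint8_t flag;
};

/* Offset of the field that follows blkaddr in each view of the message. */
static size_t flag_offset_int(void)
{
	return offsetof(struct lf_alloc_msg_int, flag);
}

static size_t flag_offset_u8(void)
{
	return offsetof(struct lf_alloc_msg_u8, flag);
}
```

Because the field after blkaddr sits at a different offset in each view, a sender and receiver built with different definitions would read each other's bytes incorrectly, which is the kind of format incompatibility the commit corrects.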
bc35e28 octeontx2-af: replace cpt slot with lf id on reg write Replace slot id with global CPT lf id on reg read/write as CPTPF/VF driver would send slot number instead of global lf id in the reg offset. And also update the mailbox response with the global lf's register offset. Fixes: ae454086e3c2 ("octeontx2-af: add mailbox interface for CPT") Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> 12 July 2024, 12:42:00 UTC
6fba5cb MAINTAINERS: Update FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY FREESCALE SOC DRIVERS has been orphaned since commit eaac25d026a1 ("MAINTAINERS: Drop Li Yang as their email address stopped working") QUICC ENGINE LIBRARY has Qiang Zhao as maintainer but he hasn't responded for years and when Li Yang was still maintaining FREESCALE SOC DRIVERS he was also handling QUICC ENGINE LIBRARY directly. As a maintainer of LINUX FOR POWERPC EMBEDDED PPC8XX AND PPC83XX, I also need FREESCALE SOC DRIVERS to be actively maintained, so add myself as maintainer of FREESCALE SOC DRIVERS and QUICC ENGINE LIBRARY. See below link for more context. Link: https://lore.kernel.org/linuxppc-dev/20240219153016.ntltc76bphwrv6hn@skbuf/T/#mf6d4a5eef79e8eae7ae0456a2794c01e630a6756 Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Arnd Bergmann <arnd@arndb.de> 12 July 2024, 11:16:09 UTC
dfd168e MAINTAINERS: Add more maintainers for omaps There are many generations of omaps to maintain, and I will be only active as a hobbyist with time permitting. Let's add more maintainers to ensure continued Linux support. TI is interested in maintaining the active SoCs such as am3, am4 and dra7. And the hobbyists are interested in maintaining some of the older devices, mainly based on omap3 and 4 SoCs. Kevin and Roger have agreed to maintain the active TI parts. Both Kevin and Roger have been working on the omap variants for a long time, and have a good understanding of the hardware. Aaro and Andreas have agreed to maintain the community devices. Both Aaro and Andreas have long experience on working with the earlier TI SoCs. While at it, let's also change me to be a reviewer for the omap1, and drop the link to my old omap web page. Signed-off-by: Tony Lindgren <tony@atomide.com> Acked-by: Kevin Hilman <khilman@baylibre.com> Acked-by: Aaro Koskinen <aaro.koskinen@iki.fi> Acked-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> 12 July 2024, 10:07:10 UTC
119736c i2c: testunit: avoid re-issued work after read message The to-be-fixed commit rightfully prevented that the registers will be cleared. However, the index must be cleared. Otherwise a read message will re-issue the last work. Fix it and add a comment describing the situation. Fixes: c422b6a63024 ("i2c: testunit: don't erase registers after STOP") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Andi Shyti <andi.shyti@kernel.org> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> 12 July 2024, 06:52:01 UTC
8b9b59e i40e: fix: remove needless retries of NVM update Remove wrong EIO to EAGAIN conversion and pass all errors as is. After commit 230f3d53a547 ("i40e: remove i40e_status"), which should only replace F/W specific error codes with Linux kernel generic, all EIO errors suddenly started to be converted into EAGAIN which leads nvmupdate to retry until it times out and sometimes fails after more than 20 minutes in the middle of NVM update, so NVM becomes corrupted. The bug affects users only at the time when they try to update NVM, and only F/W versions that generate errors during nvmupdate. For example, X710DA2 with 0x8000ECB7 F/W is affected, but there are probably more... Command for reproduction is just NVM update: ./nvmupdate64 In the log instead of: i40e_nvmupd_exec_aq err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_ENOMEM) appears: i40e_nvmupd_exec_aq err -EIO aq_err I40E_AQ_RC_ENOMEM i40e: eeprom check failed (-5), Tx/Rx traffic disabled The problematic code did silently convert EIO into EAGAIN which forced nvmupdate to ignore EAGAIN error and retry the same operation until timeout. That's why NVM update takes 20+ minutes to finish with the fail in the end. Fixes: 230f3d53a547 ("i40e: remove i40e_status") Co-developed-by: Kelvin Kang <kelvin.kang@intel.com> Signed-off-by: Kelvin Kang <kelvin.kang@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240710224455.188502-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 12 July 2024, 00:31:52 UTC
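The retry behaviour described in the i40e commit above can be illustrated with a small userspace sketch. This is not the driver code: `nvm_step_fails`, `map_error_buggy`, `map_error_fixed` and `run_update` are hypothetical names, and the retry loop only stands in for what a tool like nvmupdate might do when it sees EAGAIN.

```c
#include <errno.h>

/* Hypothetical stand-in for one NVM update step that always fails with -EIO. */
int nvm_step_fails(void)
{
	return -EIO;
}

/* Mapping as described in the commit message: -EIO silently becomes -EAGAIN. */
int map_error_buggy(int err)
{
	return err == -EIO ? -EAGAIN : err;
}

/* Behaviour after the fix: every error is passed through unchanged. */
int map_error_fixed(int err)
{
	return err;
}

/*
 * Hypothetical caller that retries on -EAGAIN up to max_tries times.
 * The number of attempts made is stored in *attempts.
 */
int run_update(int (*map_error)(int), int max_tries, int *attempts)
{
	int err = 0;

	*attempts = 0;
	while (*attempts < max_tries) {
		(*attempts)++;
		err = map_error(nvm_step_fails());
		if (err != -EAGAIN)
			break;	/* success or hard error: stop retrying */
	}
	return err;
}
```

With the buggy mapping the caller uses up every retry before giving up. With the pass-through mapping it stops after the first attempt and reports -EIO.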
503757c net: ethtool: Fix RSS setting When user submits a rxfh set command without touching XFRM_SYM_XOR, rxfh.input_xfrm is set to RXH_XFRM_NO_CHANGE, which is equal to 0xff. Testing if (rxfh.input_xfrm & RXH_XFRM_SYM_XOR && !ops->cap_rss_sym_xor_supported) return -EOPNOTSUPP; Will always be true on devices that don't set cap_rss_sym_xor_supported, since rxfh.input_xfrm & RXH_XFRM_SYM_XOR is always true, if input_xfrm was not set, i.e RXH_XFRM_NO_CHANGE=0xff, which will result in failure of any command that doesn't require any change of XFRM, e.g RSS context or hash function changes. To avoid this breakage, test if rxfh.input_xfrm != RXH_XFRM_NO_CHANGE before testing other conditions. Note that the problem will only trigger with XFRM-aware userspace, old ethtool CLI would continue to work. Fixes: 0dd415d15505 ("net: ethtool: add a NO_CHANGE uAPI for new RXFH's input_xfrm") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com> Link: https://patch.msgid.link/20240710225538.43368-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> 12 July 2024, 00:21:08 UTC
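The mask test described in the ethtool commit above can be reproduced in isolation. The sketch below is a userspace illustration, not the kernel code: `check_buggy` and `check_fixed` are hypothetical names, `RXH_XFRM_NO_CHANGE` uses the 0xff value stated in the commit message, and the bit value of `RXH_XFRM_SYM_XOR` is an assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define RXH_XFRM_SYM_XOR   (1 << 0)	/* assumed bit value */
#define RXH_XFRM_NO_CHANGE 0xff		/* value stated in the commit message */

/* Original test: 0xff has every bit set, so the SYM_XOR bit always matches. */
int check_buggy(uint8_t input_xfrm, bool sym_xor_supported)
{
	if ((input_xfrm & RXH_XFRM_SYM_XOR) && !sym_xor_supported)
		return -EOPNOTSUPP;
	return 0;
}

/* Fixed test: only inspect the bits when a change was actually requested. */
int check_fixed(uint8_t input_xfrm, bool sym_xor_supported)
{
	if (input_xfrm != RXH_XFRM_NO_CHANGE &&
	    (input_xfrm & RXH_XFRM_SYM_XOR) && !sym_xor_supported)
		return -EOPNOTSUPP;
	return 0;
}
```

On a device without symmetric-XOR support, `check_buggy(RXH_XFRM_NO_CHANGE, false)` returns -EOPNOTSUPP, while `check_fixed` accepts the same input and still rejects an explicit `RXH_XFRM_SYM_XOR` request.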
1841027 bcachefs: bch2_gc_btree() should not use btree_root_lock btree_root_lock is for the root keys in btree_root, not the pointers to the nodes themselves; this fixes a lock ordering issue between btree_root_lock and btree node locks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> 12 July 2024, 00:10:55 UTC
f236ea4 bcachefs: Set PF_MEMALLOC_NOFS when trans->locked proper lock ordering is: fs_reclaim -> btree node locks Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> 12 July 2024, 00:10:55 UTC
f0f3e51 bcachefs; Use trans_unlock_long() when waiting on allocator not using unlock_long() blocks key cache reclaim, and the allocator may take awhile Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> 12 July 2024, 00:10:55 UTC
aacd897 Revert "bcachefs: Mark bch_inode_info as SLAB_ACCOUNT" This reverts commit 86d81ec5f5f05846c7c6e48ffb964b24cba2e669. This wasn't tested with memcg enabled, it immediately hits a null ptr deref in list_lru_add(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> 12 July 2024, 00:01:38 UTC
ea5ea84 i2c: rcar: ensure Gen3+ reset does not disturb local targets R-Car Gen3+ needs a reset before every controller transfer. That erases configuration of a potentially in parallel running local target instance. To avoid this disruption, avoid controller transfers if a local target is running. Also, disable SMBusHostNotify because it requires being a controller and local target at the same time. Fixes: 3b770017b03a ("i2c: rcar: handle RXDMA HW behaviour on Gen3") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> 11 July 2024, 23:45:08 UTC
43db1e0 Merge tag 'for-6.10/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fix from Mikulas Patocka: - Fix broken discard for device mapper VDO target * tag 'for-6.10/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm vdo: replace max_discard_sectors with max_hw_discard_sectors 11 July 2024, 22:11:14 UTC
d5cfecf dm vdo: replace max_discard_sectors with max_hw_discard_sectors Commit 4f563a64732d ("block: add a max_user_discard_sectors queue limit") changed block core to set max_discard_sectors to: min(lim->max_hw_discard_sectors, lim->max_user_discard_sectors) Commit 825d8bbd2f32 ("dm: always manage discard support in terms of max_hw_discard_sectors") fixed most dm targets to deal with this, by replacing max_discard_sectors with max_hw_discard_sectors. Unfortunately, dm-vdo did not get fixed at that time. Fixes: 825d8bbd2f32 ("dm: always manage discard support in terms of max_hw_discard_sectors") Signed-off-by: Bruce Johnston <bjohnsto@redhat.com> Signed-off-by: Matthew Sakai <msakai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> 11 July 2024, 19:24:41 UTC
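A minimal sketch of why a target that sets only max_discard_sectors loses its value once the core derives that field, assuming the min() rule quoted in the dm vdo commit above. `struct limits_sketch` and `recompute_discard_limit` are hypothetical names, and the unlimited default for the user limit is an assumption, not taken from the commit.

```c
#include <stdint.h>

/* Hypothetical, reduced stand-in for the discard-related queue limits. */
struct limits_sketch {
	uint32_t max_hw_discard_sectors;	/* set by the driver or target */
	uint32_t max_user_discard_sectors;	/* assumed to default to "unlimited" */
	uint32_t max_discard_sectors;		/* derived by the core */
};

/* Derive the effective limit as min(hw, user), per the commit message. */
void recompute_discard_limit(struct limits_sketch *lim)
{
	lim->max_discard_sectors =
		lim->max_hw_discard_sectors < lim->max_user_discard_sectors ?
		lim->max_hw_discard_sectors : lim->max_user_discard_sectors;
}
```

If a target writes 1024 into `max_discard_sectors` but leaves `max_hw_discard_sectors` at 0, the recomputation yields 0. Setting `max_hw_discard_sectors` to 1024 instead yields the intended 1024.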
8a18fda Merge tag 'spi-fix-v6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "This fixes two regressions that have been bubbling along for a large part of this release. One is a revert of the multi mode support for the OMAP SPI controller, this introduced regressions on a number of systems and while there has been progress on fixing those we've not got something that works for everyone yet so let's just drop the change for now. The other is a series of fixes from David Lechner for his recent message optimisation work, this interacted badly with spi-mux which is altogether too clever with recursive use of the bus and creates situations that hadn't been considered. There are also a couple of small driver specific fixes, including one more patch from David for sleep duration calculations in the AXI driver" * tag 'spi-fix-v6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: mux: set ctlr->bits_per_word_mask spi: add defer_optimize_message controller flag spi: don't unoptimize message in spi_async() spi: omap2-mcspi: Revert multi mode support spi: davinci: Unset POWERDOWN bit when releasing resources spi: axi-spi-engine: fix sleep calculation spi: imx: Don't expect DMA for i.MX{25,35,50,51,53} cspi devices 11 July 2024, 19:07:50 UTC
51df8e0 Merge tag 'net-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf and netfilter. Current release - regressions: - core: fix rc7's __skb_datagram_iter() regression Current release - new code bugs: - eth: bnxt: fix crashes when reducing ring count with active RSS contexts Previous releases - regressions: - sched: fix UAF when resolving a clash - skmsg: skip zero length skb in sk_msg_recvmsg2 - sunrpc: fix kernel free on connection failure in xs_tcp_setup_socket - tcp: avoid too many retransmit packets - tcp: fix incorrect undo caused by DSACK of TLP retransmit - udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port(). - eth: ks8851: fix deadlock with the SPI chip variant - eth: i40e: fix XDP program unloading while removing the driver Previous releases - always broken: - bpf: - fix too early release of tcx_entry - fail bpf_timer_cancel when callback is being cancelled - bpf: fix order of args in call to bpf_map_kvcalloc - netfilter: nf_tables: prefer nft_chain_validate - ppp: reject claimed-as-LCP but actually malformed packets - wireguard: avoid unaligned 64-bit memory accesses" * tag 'net-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits) net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket net/sched: Fix UAF when resolving a clash net: ks8851: Fix potential TX stall after interface reopen udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port(). 
netfilter: nf_tables: prefer nft_chain_validate netfilter: nfnetlink_queue: drop bogus WARN_ON ethtool: netlink: do not return SQI value if link is down ppp: reject claimed-as-LCP but actually malformed packets selftests/bpf: Add timer lockup selftest net: ethernet: mtk-star-emac: set mac_managed_pm when probing e1000e: fix force smbus during suspend flow tcp: avoid too many retransmit packets bpf: Defer work in bpf_timer_cancel_and_free bpf: Fail bpf_timer_cancel when callback is being cancelled bpf: fix order of args in call to bpf_map_kvcalloc net: ethernet: lantiq_etop: fix double free in detach i40e: Fix XDP program unloading while removing the driver net: fix rc7's __skb_datagram_iter() net: ks8851: Fix deadlock with the SPI chip variant octeontx2-af: Fix incorrect value output on error path in rvu_check_rsrc_availability() ... 11 July 2024, 16:29:49 UTC
83ab4b4 Merge tag 'vfs-6.10-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "cachefiles: - Export an existing and add a new cachefile helper to be used in filesystems to fix reference count bugs - Use the newly added fscache_try_get_volume() helper to get a reference count on an fscache_volume to handle volumes that are about to be removed cleanly - After withdrawing a fscache_cache via FSCACHE_CACHE_IS_WITHDRAWN wait for all ongoing cookie lookups to complete and for the object count to reach zero - Propagate errors from vfs_getxattr() to avoid an infinite loop in cachefiles_check_volume_xattr() because it keeps seeing ESTALE - Don't send new requests when an object is dropped by raising CACHEFILES_ONDEMAND_OBJSTATE_DROPPING - Cancel all requests for an object that is about to be dropped - Wait for the ondemand_object_worker to finish before dropping a cachefiles object to prevent use-after-free - Use cyclic allocation for message ids to better handle id recycling - Add missing lock protection when iterating through the xarray when polling netfs: - Use standard logging helpers for debug logging VFS: - Fix potential use-after-free in file locks during trace_posix_lock_inode(). The tracepoint could fire while another task raced it and freed the lock that was requested to be traced - Only increment the nr_dentry_negative counter for dentries that are present on the superblock LRU. Currently, DCACHE_LRU_LIST list is used to detect this case. However, the flag is also raised in combination with DCACHE_SHRINK_LIST to indicate that dentry->d_lru is used. So checking only DCACHE_LRU_LIST will lead to wrong nr_dentry_negative count. Fix the check to not count dentries that are on a shrink related list Misc: - hfsplus: fix an uninitialized value issue in copy_name - minix: fix minixfs_rename with HIGHMEM. 
It still uses kunmap() even though we switched it to kmap_local_page() a while ago" * tag 'vfs-6.10-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: minixfs: Fix minixfs_rename with HIGHMEM hfsplus: fix uninit-value in copy_name vfs: don't mod negative dentry count when on shrinker list filelock: fix potential use-after-free in posix_lock_inode cachefiles: add missing lock protection when polling cachefiles: cyclic allocation of msg_id to avoid reuse cachefiles: wait for ondemand_object_worker to finish when dropping object cachefiles: cancel all requests for the object that is being dropped cachefiles: stop sending new request when dropping object cachefiles: propagate errors from vfs_getxattr() to avoid infinite loop cachefiles: fix slab-use-after-free in cachefiles_withdraw_cookie() cachefiles: fix slab-use-after-free in fscache_withdraw_volume() netfs, fscache: export fscache_put_volume() and add fscache_try_get_volume() netfs: Switch debug logging to pr_debug() 11 July 2024, 16:03:28 UTC
16198ee mmc: davinci_mmc: Prevent transmitted data size from exceeding sgm's length No check is done on the size of the data to be transmitted. This causes a kernel panic when this size exceeds the sg_miter's length. Limit the number of transmitted bytes to sgm->length. Cc: stable@vger.kernel.org Fixes: ed01d210fd91 ("mmc: davinci_mmc: Use sg_miter for PIO") Signed-off-by: Bastien Curutchet <bastien.curutchet@bootlin.com> Link: https://lore.kernel.org/r/20240711081838.47256-2-bastien.curutchet@bootlin.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> 11 July 2024, 15:48:54 UTC
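The bound described in the davinci_mmc commit above amounts to clamping the requested byte count to the length of the current mapping before copying. The sketch below shows that idea in plain C. `struct miter_sketch` and `pio_write_chunk` are hypothetical names, and the real driver writes to a hardware FIFO rather than a memory buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the mapped part of an sg_miter. */
struct miter_sketch {
	const unsigned char *addr;	/* start of the currently mapped chunk */
	size_t length;			/* number of valid bytes at addr */
};

/*
 * Copy at most n bytes from the mapped chunk into fifo, never reading
 * past the end of the chunk. Returns the number of bytes copied.
 */
size_t pio_write_chunk(const struct miter_sketch *sgm, unsigned char *fifo,
		       size_t n)
{
	if (n > sgm->length)
		n = sgm->length;	/* the clamp the commit message describes */
	memcpy(fifo, sgm->addr, n);
	return n;
}
```

A caller that asks for 8 bytes from a 4-byte chunk gets 4 bytes back and must fetch the next chunk for the rest, instead of reading beyond the mapping.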
63d20a9 mmc: sdhci: Fix max_seg_size for 64KiB PAGE_SIZE blk_queue_max_segment_size() ensured: if (max_size < PAGE_SIZE) max_size = PAGE_SIZE; whereas: blk_validate_limits() makes it an error: if (WARN_ON_ONCE(lim->max_segment_size < PAGE_SIZE)) return -EINVAL; The change from one to the other, exposed sdhci which was setting maximum segment size too low in some circumstances. Fix the maximum segment size when it is too low. Fixes: 616f87661792 ("mmc: pass queue_limits to blk_mq_alloc_disk") Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Link: https://lore.kernel.org/r/20240710180737.142504-1-adrian.hunter@intel.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> 11 July 2024, 15:48:40 UTC
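The behaviour change quoted in the sdhci commit above, from silently raising a too-small segment size to rejecting it, can be sketched as follows. The first two functions mirror the snippets in the commit message with the page size passed as a parameter. `driver_fixup` is a hypothetical helper showing one way a driver could raise its limit, not necessarily what the sdhci patch does.

```c
#include <errno.h>

/* Old behaviour: a too-small value is silently raised to the page size. */
unsigned int old_set_max_segment_size(unsigned int max_size,
				      unsigned int page_size)
{
	if (max_size < page_size)
		max_size = page_size;
	return max_size;
}

/* New behaviour: a too-small value is an error (WARN_ON_ONCE omitted here). */
int new_validate_limits(unsigned int max_segment_size, unsigned int page_size)
{
	if (max_segment_size < page_size)
		return -EINVAL;
	return 0;
}

/* Hypothetical driver-side fixup: raise the limit before validation. */
unsigned int driver_fixup(unsigned int max_seg_size, unsigned int page_size)
{
	return max_seg_size < page_size ? page_size : max_seg_size;
}
```

With an illustrative 64 KiB page size (65536) and a 4096-byte segment limit, `new_validate_limits` fails with -EINVAL unless the driver applies the fixup first. The old helper would have accepted the value by raising it.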
f19e102 Merge tag 'asoc-fix-v6.10-rc7' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.10 A few fairly small fixes for ASoC, there's a relatively large set of hardening changes for the cs_dsp firmware file parsing and a couple of other small device specific fixes. 11 July 2024, 15:11:50 UTC
4484940 btrfs: avoid races when tracking progress for extent map shrinking We store the progress (root and inode numbers) of the extent map shrinker in fs_info without any synchronization but we can have multiple tasks calling into the shrinker during memory allocations when there's enough memory pressure for example. This can result in a task A reading fs_info->extent_map_shrinker_last_ino after another task B updates it, and task A reading fs_info->extent_map_shrinker_last_root before task B updates it, making task A see an odd state that isn't necessarily harmful but may make it skip certain inode ranges or do more work than necessary by going over the same inodes again. These unprotected accesses would also trigger warnings from tools like KCSAN. So add a lock to protect access to these progress fields. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> 11 July 2024, 14:50:54 UTC
b3ebb9b btrfs: stop extent map shrinker if reschedule is needed The extent map shrinker can be called in a variety of contexts where we are under memory pressure, and one of them is when a task is trying to allocate memory. For this reason the shrinker is typically called with a value of struct shrink_control::nr_to_scan that is much smaller than what we return in the nr_cached_objects callback of struct super_operations (fs/btrfs/super.c:btrfs_nr_cached_objects()), so that the shrinker does not take a long time and cause high latencies. However we can still take a lot of time in the shrinker even for a limited amount of nr_to_scan: 1) When traversing the red black tree that tracks open inodes in a root, as for example with millions of open inodes we get a deep tree which takes time searching for an inode; 2) Iterating over the extent map tree, which is a red black tree, of an inode when doing the rb_next() calls and when removing an extent map from the tree, since often that requires rebalancing the red black tree; 3) When trying to write lock an inode's extent map tree we may wait for a significant amount of time, because there's either another task about to do IO and searching for an extent map in the tree or inserting an extent map in the tree, and we can have thousands or even millions of extent maps for an inode. Furthermore, there can be concurrent calls to the shrinker so the lock might be busy simply because there is already another task shrinking extent maps for the same inode; 4) We often reschedule if we need to, which further increases latency. So improve on this by stopping the extent map shrinking code whenever we need to reschedule and make it skip an inode if we can't immediately lock its extent map tree. 
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reported-by: Andrea Gelmini <andrea.gelmini@gmail.com> Link: https://lore.kernel.org/linux-btrfs/CABXGCsMmmb36ym8hVNGTiU8yfUS_cGvoUmGCcBrGWq9OxTrs+A@mail.gmail.com/ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> 11 July 2024, 14:45:42 UTC
68a3ebd btrfs: use delayed iput during extent map shrinking When putting an inode during extent map shrinking we're doing a standard iput() but that may take a long time in case the inode is dirty and we are doing the final iput that triggers eviction - the VFS will have to wait for writeback before calling the btrfs evict callback (see fs/inode.c:evict()). This slows down the task running the shrinker which may have been triggered while updating some tree for example, meaning locks are held as well as an open transaction handle. Also if the iput() ends up triggering eviction and the inode has no links anymore, then we trigger item truncation which requires flushing delayed items, space reservation to start a transaction and that may trigger the space reclaim task and wait for it, resulting in deadlocks in case the reclaim task needs for example to commit a transaction and the shrinker is being triggered from a path holding a transaction handle. Syzbot reported such a case with the following stack traces: ====================================================== WARNING: possible circular locking dependency detected 6.10.0-rc2-syzkaller-00010-g2ab795141095 #0 Not tainted ------------------------------------------------------ kswapd0/111 is trying to acquire lock: ffff88801eae4610 (sb_internal#3){.+.+}-{0:0}, at: btrfs_commit_inode_delayed_inode+0x110/0x330 fs/btrfs/delayed-inode.c:1275 but task is already holding lock: ffffffff8dd3a9a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0xa88/0x1970 mm/vmscan.c:6924 which lock already depends on the new lock. 
the existing dependency chain (in reverse order) is: -> #3 (fs_reclaim){+.+.}-{0:0}: __fs_reclaim_acquire mm/page_alloc.c:3783 [inline] fs_reclaim_acquire+0x102/0x160 mm/page_alloc.c:3797 might_alloc include/linux/sched/mm.h:334 [inline] slab_pre_alloc_hook mm/slub.c:3890 [inline] slab_alloc_node mm/slub.c:3980 [inline] kmem_cache_alloc_lru_noprof+0x58/0x2f0 mm/slub.c:4019 btrfs_alloc_inode+0x118/0xb20 fs/btrfs/inode.c:8411 alloc_inode+0x5d/0x230 fs/inode.c:261 iget5_locked fs/inode.c:1235 [inline] iget5_locked+0x1c9/0x2c0 fs/inode.c:1228 btrfs_iget_locked fs/btrfs/inode.c:5590 [inline] btrfs_iget_path fs/btrfs/inode.c:5607 [inline] btrfs_iget+0xfb/0x230 fs/btrfs/inode.c:5636 create_reloc_inode+0x403/0x820 fs/btrfs/relocation.c:3911 btrfs_relocate_block_group+0x471/0xe60 fs/btrfs/relocation.c:4114 btrfs_relocate_chunk+0x143/0x450 fs/btrfs/volumes.c:3373 __btrfs_balance fs/btrfs/volumes.c:4157 [inline] btrfs_balance+0x211a/0x3f00 fs/btrfs/volumes.c:4534 btrfs_ioctl_balance fs/btrfs/ioctl.c:3675 [inline] btrfs_ioctl+0x12ed/0x8290 fs/btrfs/ioctl.c:4742 __do_compat_sys_ioctl+0x2c3/0x330 fs/ioctl.c:1007 do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline] __do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386 do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411 entry_SYSENTER_compat_after_hwframe+0x84/0x8e -> #2 (btrfs_trans_num_extwriters){++++}-{0:0}: join_transaction+0x164/0xf40 fs/btrfs/transaction.c:315 start_transaction+0x427/0x1a70 fs/btrfs/transaction.c:700 btrfs_rebuild_free_space_tree+0xaa/0x480 fs/btrfs/free-space-tree.c:1323 btrfs_start_pre_rw_mount+0x218/0xf60 fs/btrfs/disk-io.c:2999 open_ctree+0x41ab/0x52e0 fs/btrfs/disk-io.c:3554 btrfs_fill_super fs/btrfs/super.c:946 [inline] btrfs_get_tree_super fs/btrfs/super.c:1863 [inline] btrfs_get_tree+0x11e9/0x1b90 fs/btrfs/super.c:2089 vfs_get_tree+0x8f/0x380 fs/super.c:1780 fc_mount+0x16/0xc0 fs/namespace.c:1125 btrfs_get_tree_subvol fs/btrfs/super.c:2052 [inline] btrfs_get_tree+0xa53/0x1b90 
fs/btrfs/super.c:2090 vfs_get_tree+0x8f/0x380 fs/super.c:1780 do_new_mount fs/namespace.c:3352 [inline] path_mount+0x6e1/0x1f10 fs/namespace.c:3679 do_mount fs/namespace.c:3692 [inline] __do_sys_mount fs/namespace.c:3898 [inline] __se_sys_mount fs/namespace.c:3875 [inline] __ia32_sys_mount+0x295/0x320 fs/namespace.c:3875 do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline] __do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386 do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411 entry_SYSENTER_compat_after_hwframe+0x84/0x8e -> #1 (btrfs_trans_num_writers){++++}-{0:0}: join_transaction+0x148/0xf40 fs/btrfs/transaction.c:314 start_transaction+0x427/0x1a70 fs/btrfs/transaction.c:700 btrfs_rebuild_free_space_tree+0xaa/0x480 fs/btrfs/free-space-tree.c:1323 btrfs_start_pre_rw_mount+0x218/0xf60 fs/btrfs/disk-io.c:2999 open_ctree+0x41ab/0x52e0 fs/btrfs/disk-io.c:3554 btrfs_fill_super fs/btrfs/super.c:946 [inline] btrfs_get_tree_super fs/btrfs/super.c:1863 [inline] btrfs_get_tree+0x11e9/0x1b90 fs/btrfs/super.c:2089 vfs_get_tree+0x8f/0x380 fs/super.c:1780 fc_mount+0x16/0xc0 fs/namespace.c:1125 btrfs_get_tree_subvol fs/btrfs/super.c:2052 [inline] btrfs_get_tree+0xa53/0x1b90 fs/btrfs/super.c:2090 vfs_get_tree+0x8f/0x380 fs/super.c:1780 do_new_mount fs/namespace.c:3352 [inline] path_mount+0x6e1/0x1f10 fs/namespace.c:3679 do_mount fs/namespace.c:3692 [inline] __do_sys_mount fs/namespace.c:3898 [inline] __se_sys_mount fs/namespace.c:3875 [inline] __ia32_sys_mount+0x295/0x320 fs/namespace.c:3875 do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline] __do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386 do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411 entry_SYSENTER_compat_after_hwframe+0x84/0x8e -> #0 (sb_internal#3){.+.+}-{0:0}: check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain kernel/locking/lockdep.c:3869 [inline] __lock_acquire+0x2478/0x3b30 kernel/locking/lockdep.c:5137 
lock_acquire kernel/locking/lockdep.c:5754 [inline] lock_acquire+0x1b1/0x560 kernel/locking/lockdep.c:5719 percpu_down_read include/linux/percpu-rwsem.h:51 [inline] __sb_start_write include/linux/fs.h:1655 [inline] sb_start_intwrite include/linux/fs.h:1838 [inline] start_transaction+0xbc1/0x1a70 fs/btrfs/transaction.c:694 btrfs_commit_inode_delayed_inode+0x110/0x330 fs/btrfs/delayed-inode.c:1275 btrfs_evict_inode+0x960/0xe80 fs/btrfs/inode.c:5291 evict+0x2ed/0x6c0 fs/inode.c:667 iput_final fs/inode.c:1741 [inline] iput.part.0+0x5a8/0x7f0 fs/inode.c:1767 iput+0x5c/0x80 fs/inode.c:1757 btrfs_scan_root fs/btrfs/extent_map.c:1118 [inline] btrfs_free_extent_maps+0xbd3/0x1320 fs/btrfs/extent_map.c:1189 super_cache_scan+0x409/0x550 fs/super.c:227 do_shrink_slab+0x44f/0x11c0 mm/shrinker.c:435 shrink_slab+0x18a/0x1310 mm/shrinker.c:662 shrink_one+0x493/0x7c0 mm/vmscan.c:4790 shrink_many mm/vmscan.c:4851 [inline] lru_gen_shrink_node+0x89f/0x1750 mm/vmscan.c:4951 shrink_node mm/vmscan.c:5910 [inline] kswapd_shrink_node mm/vmscan.c:6720 [inline] balance_pgdat+0x1105/0x1970 mm/vmscan.c:6911 kswapd+0x5ea/0xbf0 mm/vmscan.c:7180 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 other info that might help us debug this: Chain exists of: sb_internal#3 --> btrfs_trans_num_extwriters --> fs_reclaim Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(fs_reclaim); lock(btrfs_trans_num_extwriters); lock(fs_reclaim); rlock(sb_internal#3); *** DEADLOCK *** 2 locks held by kswapd0/111: #0: ffffffff8dd3a9a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0xa88/0x1970 mm/vmscan.c:6924 #1: ffff88801eae40e0 (&type->s_umount_key#62){++++}-{3:3}, at: super_trylock_shared fs/super.c:562 [inline] #1: ffff88801eae40e0 (&type->s_umount_key#62){++++}-{3:3}, at: super_cache_scan+0x96/0x550 fs/super.c:196 stack backtrace: CPU: 0 PID: 111 Comm: kswapd0 Not tainted 
6.10.0-rc2-syzkaller-00010-g2ab795141095 #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:114 check_noncircular+0x31a/0x400 kernel/locking/lockdep.c:2187 check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain kernel/locking/lockdep.c:3869 [inline] __lock_acquire+0x2478/0x3b30 kernel/locking/lockdep.c:5137 lock_acquire kernel/locking/lockdep.c:5754 [inline] lock_acquire+0x1b1/0x560 kernel/locking/lockdep.c:5719 percpu_down_read include/linux/percpu-rwsem.h:51 [inline] __sb_start_write include/linux/fs.h:1655 [inline] sb_start_intwrite include/linux/fs.h:1838 [inline] start_transaction+0xbc1/0x1a70 fs/btrfs/transaction.c:694 btrfs_commit_inode_delayed_inode+0x110/0x330 fs/btrfs/delayed-inode.c:1275 btrfs_evict_inode+0x960/0xe80 fs/btrfs/inode.c:5291 evict+0x2ed/0x6c0 fs/inode.c:667 iput_final fs/inode.c:1741 [inline] iput.part.0+0x5a8/0x7f0 fs/inode.c:1767 iput+0x5c/0x80 fs/inode.c:1757 btrfs_scan_root fs/btrfs/extent_map.c:1118 [inline] btrfs_free_extent_maps+0xbd3/0x1320 fs/btrfs/extent_map.c:1189 super_cache_scan+0x409/0x550 fs/super.c:227 do_shrink_slab+0x44f/0x11c0 mm/shrinker.c:435 shrink_slab+0x18a/0x1310 mm/shrinker.c:662 shrink_one+0x493/0x7c0 mm/vmscan.c:4790 shrink_many mm/vmscan.c:4851 [inline] lru_gen_shrink_node+0x89f/0x1750 mm/vmscan.c:4951 shrink_node mm/vmscan.c:5910 [inline] kswapd_shrink_node mm/vmscan.c:6720 [inline] balance_pgdat+0x1105/0x1970 mm/vmscan.c:6911 kswapd+0x5ea/0xbf0 mm/vmscan.c:7180 kthread+0x2c1/0x3a0 kernel/kthread.c:389 ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> So fix this by using btrfs_add_delayed_iput() so that the final iput is delegated to the cleaner kthread. 
Link: https://lore.kernel.org/linux-btrfs/000000000000892280061a344581@google.com/ Reported-by: syzbot+3dad89b3993a4b275e72@syzkaller.appspotmail.com Fixes: 956a17d9d050 ("btrfs: add a shrinker for extent maps") Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> 11 July 2024, 14:45:18 UTC
359bc01 libceph: fix crush_choose_firstn() kernel-doc warnings Currently, when built with "make W=1", the following warnings are generated: net/ceph/crush/mapper.c:466: warning: Function parameter or struct member 'work' not described in 'crush_choose_firstn' net/ceph/crush/mapper.c:466: warning: Function parameter or struct member 'weight' not described in 'crush_choose_firstn' net/ceph/crush/mapper.c:466: warning: Function parameter or struct member 'weight_max' not described in 'crush_choose_firstn' net/ceph/crush/mapper.c:466: warning: Function parameter or struct member 'choose_args' not described in 'crush_choose_firstn' Update the crush_choose_firstn() kernel-doc to document these parameters. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> 11 July 2024, 14:33:07 UTC
6463c36 libceph: suppress crush_choose_indep() kernel-doc warnings Currently, when built with "make W=1", the following warnings are generated: net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'map' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'work' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'bucket' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'weight' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'weight_max' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'x' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'left' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'numrep' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'type' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'out' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'outpos' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'tries' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'recurse_tries' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'recurse_to_leaf' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'out2' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: 
Function parameter or struct member 'parent_r' not described in 'crush_choose_indep' net/ceph/crush/mapper.c:655: warning: Function parameter or struct member 'choose_args' not described in 'crush_choose_indep' These warnings are generated because the prologue comment for crush_choose_indep() uses the kernel-doc prefix, but the actual comment is a very brief description that is not in kernel-doc format. Since this is a static function there is no need to fully document the function, so replace the kernel-doc comment prefix with a standard comment prefix to remove these warnings. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> 11 July 2024, 14:30:53 UTC
d7c199e Merge tag 'nf-24-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following batch contains Netfilter fixes for net: Patch #1 fixes a bogus WARN_ON splat in nfnetlink_queue. Patch #2 fixes a crash due to stack overflow in chain loop detection by using the existing chain validation routines Both patches from Florian Westphal. netfilter pull request 24-07-11 * tag 'nf-24-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: prefer nft_chain_validate netfilter: nfnetlink_queue: drop bogus WARN_ON ==================== Link: https://patch.msgid.link/20240711093948.3816-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> 11 July 2024, 10:57:10 UTC
a819ff0 Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-07-11 The following pull-request contains BPF updates for your *net* tree. We've added 4 non-merge commits during the last 2 day(s) which contain a total of 4 files changed, 262 insertions(+), 19 deletions(-). The main changes are: 1) Fixes for a BPF timer lockup and a use-after-free scenario when timers are used concurrently, from Kumar Kartikeya Dwivedi. 2) Fix the argument order in the call to bpf_map_kvcalloc() which could otherwise lead to a compilation error, from Mohammad Shehar Yaar Tausif. bpf-for-netdev * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Add timer lockup selftest bpf: Defer work in bpf_timer_cancel_and_free bpf: Fail bpf_timer_cancel when callback is being cancelled bpf: fix order of args in call to bpf_map_kvcalloc ==================== Link: https://patch.msgid.link/20240711084016.25757-1-daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com> 11 July 2024, 10:38:33 UTC
626dfed net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket When using a BPF program on kernel_connect(), the call can return -EPERM. This causes xs_tcp_setup_socket() to loop forever, filling up the syslog and causing the kernel to potentially freeze up. Neil suggested: This will propagate -EPERM up into other layers which might not be ready to handle it. It might be safer to map EPERM to an error we would be more likely to expect from the network system - such as ECONNREFUSED or ENETDOWN. ECONNREFUSED as error seems reasonable. For programs setting a different error can be out of reach (see handling in 4fbac77d2d09) in particular on kernels which do not have f10d05966196 ("bpf: Make BPF_PROG_RUN_ARRAY return -err instead of allow boolean"), thus given that it is better to simply remap for consistent behavior. UDP does handle EPERM in xs_udp_send_request(). Fixes: d74bad4e74ee ("bpf: Hooks for sys_connect") Fixes: 4fbac77d2d09 ("bpf: Hooks for sys_bind") Co-developed-by: Lex Siegel <usiegl00@gmail.com> Signed-off-by: Lex Siegel <usiegl00@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Neil Brown <neilb@suse.de> Cc: Trond Myklebust <trondmy@kernel.org> Cc: Anna Schumaker <anna@kernel.org> Link: https://github.com/cilium/cilium/issues/33395 Link: https://lore.kernel.org/bpf/171374175513.12877.8993642908082014881@noble.neil.brown.name Link: https://patch.msgid.link/9069ec1d59e4b2129fc23433349fd5580ad43921.1720075070.git.daniel@iogearbox.net Signed-off-by: Paolo Abeni <pabeni@redhat.com> 11 July 2024, 10:17:45 UTC
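The remapping described in this commit can be illustrated with a small userspace sketch. The helper name `sketch_remap_connect_error` is hypothetical and is not the code in xs_tcp_setup_socket(); it only shows the idea of translating -EPERM into -ECONNREFUSED before the error reaches layers that do not expect it.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper (not the kernel's code): translate the result of a
 * connect attempt into an error the upper layers are prepared to handle.
 * Callers pass 0 on success or a negative errno, kernel style.
 */
static int sketch_remap_connect_error(int err)
{
    assert(err <= 0);

    /* A BPF program attached to connect can reject the call with -EPERM;
     * report it as a refused connection instead. */
    if (err == -EPERM)
        return -ECONNREFUSED;

    /* Every other value passes through unchanged. */
    return err;
}
```

Mapping to a single well-known network error keeps the retry logic of the caller on a path it already handles.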
2648817 net/sched: Fix UAF when resolving a clash KASAN reports the following UAF: BUG: KASAN: slab-use-after-free in tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct] Read of size 1 at addr ffff888c07603600 by task handler130/6469 Call Trace: <IRQ> dump_stack_lvl+0x48/0x70 print_address_description.constprop.0+0x33/0x3d0 print_report+0xc0/0x2b0 kasan_report+0xd0/0x120 __asan_load1+0x6c/0x80 tcf_ct_flow_table_process_conn+0x12b/0x380 [act_ct] tcf_ct_act+0x886/0x1350 [act_ct] tcf_action_exec+0xf8/0x1f0 fl_classify+0x355/0x360 [cls_flower] __tcf_classify+0x1fd/0x330 tcf_classify+0x21c/0x3c0 sch_handle_ingress.constprop.0+0x2c5/0x500 __netif_receive_skb_core.constprop.0+0xb25/0x1510 __netif_receive_skb_list_core+0x220/0x4c0 netif_receive_skb_list_internal+0x446/0x620 napi_complete_done+0x157/0x3d0 gro_cell_poll+0xcf/0x100 __napi_poll+0x65/0x310 net_rx_action+0x30c/0x5c0 __do_softirq+0x14f/0x491 __irq_exit_rcu+0x82/0xc0 irq_exit_rcu+0xe/0x20 common_interrupt+0xa1/0xb0 </IRQ> <TASK> asm_common_interrupt+0x27/0x40 Allocated by task 6469: kasan_save_stack+0x38/0x70 kasan_set_track+0x25/0x40 kasan_save_alloc_info+0x1e/0x40 __kasan_krealloc+0x133/0x190 krealloc+0xaa/0x130 nf_ct_ext_add+0xed/0x230 [nf_conntrack] tcf_ct_act+0x1095/0x1350 [act_ct] tcf_action_exec+0xf8/0x1f0 fl_classify+0x355/0x360 [cls_flower] __tcf_classify+0x1fd/0x330 tcf_classify+0x21c/0x3c0 sch_handle_ingress.constprop.0+0x2c5/0x500 __netif_receive_skb_core.constprop.0+0xb25/0x1510 __netif_receive_skb_list_core+0x220/0x4c0 netif_receive_skb_list_internal+0x446/0x620 napi_complete_done+0x157/0x3d0 gro_cell_poll+0xcf/0x100 __napi_poll+0x65/0x310 net_rx_action+0x30c/0x5c0 __do_softirq+0x14f/0x491 Freed by task 6469: kasan_save_stack+0x38/0x70 kasan_set_track+0x25/0x40 kasan_save_free_info+0x2b/0x60 ____kasan_slab_free+0x180/0x1f0 __kasan_slab_free+0x12/0x30 slab_free_freelist_hook+0xd2/0x1a0 __kmem_cache_free+0x1a2/0x2f0 kfree+0x78/0x120 nf_conntrack_free+0x74/0x130 [nf_conntrack] nf_ct_destroy+0xb2/0x140 
[nf_conntrack] __nf_ct_resolve_clash+0x529/0x5d0 [nf_conntrack] nf_ct_resolve_clash+0xf6/0x490 [nf_conntrack] __nf_conntrack_confirm+0x2c6/0x770 [nf_conntrack] tcf_ct_act+0x12ad/0x1350 [act_ct] tcf_action_exec+0xf8/0x1f0 fl_classify+0x355/0x360 [cls_flower] __tcf_classify+0x1fd/0x330 tcf_classify+0x21c/0x3c0 sch_handle_ingress.constprop.0+0x2c5/0x500 __netif_receive_skb_core.constprop.0+0xb25/0x1510 __netif_receive_skb_list_core+0x220/0x4c0 netif_receive_skb_list_internal+0x446/0x620 napi_complete_done+0x157/0x3d0 gro_cell_poll+0xcf/0x100 __napi_poll+0x65/0x310 net_rx_action+0x30c/0x5c0 __do_softirq+0x14f/0x491 The ct may be dropped if a clash has been resolved but is still passed to the tcf_ct_flow_table_process_conn function for further usage. This issue can be fixed by retrieving ct from skb again after confirming conntrack. Fixes: 0cc254e5aa37 ("net/sched: act_ct: Offload connections with commit action") Co-developed-by: Gerald Yang <gerald.yang@canonical.com> Signed-off-by: Gerald Yang <gerald.yang@canonical.com> Signed-off-by: Chengen Du <chengen.du@canonical.com> Link: https://patch.msgid.link/20240710053747.13223-1-chengen.du@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> 11 July 2024, 10:07:54 UTC
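The fix pattern named in this commit, reloading the pointer from its owner after a call that may have freed it, can be sketched in plain C. All types and functions below (`sketch_conn`, `sketch_pkt`, `sketch_confirm`, `sketch_process`) are hypothetical stand-ins and do not mirror the act_ct or conntrack data structures.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_conn { int id; };
struct sketch_pkt  { struct sketch_conn *conn; };  /* packet owns a reference */

/*
 * Stand-in for a confirm step: if another entry won the clash, the packet's
 * own entry is freed and the winner is attached to the packet instead.
 */
static void sketch_confirm(struct sketch_pkt *pkt, struct sketch_conn *winner)
{
    if (winner && winner != pkt->conn) {
        free(pkt->conn);
        pkt->conn = winner;
    }
}

static int sketch_process(struct sketch_pkt *pkt, struct sketch_conn *winner)
{
    assert(pkt && pkt->conn);

    sketch_confirm(pkt, winner);

    /* Reload from the packet after confirming: a pointer cached before the
     * call could refer to the entry that was just freed. */
    struct sketch_conn *conn = pkt->conn;
    return conn->id;
}
```

Caching `pkt->conn` in a local before `sketch_confirm()` and using it afterwards would reproduce the use-after-free shape that KASAN reported.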
7a99afe net: ks8851: Fix potential TX stall after interface reopen The amount of TX space in the hardware buffer is tracked in the tx_space variable. The initial value is currently only set during driver probing. After closing the interface and reopening it the tx_space variable has the last value it had before close. If it is smaller than the size of the first packet sent after reopening the interface the queue will be stopped. The queue is woken up after receiving a TX interrupt but this will never happen since we did not send anything. This commit moves the initialization of the tx_space variable to the ks8851_net_open function right before starting the TX queue. Also query the value from the hardware instead of using a hard-coded value. Only the SPI chip variant is affected by this issue because only this driver variant actually depends on the tx_space variable in the xmit function. Fixes: 3dc5d4454545 ("net: ks8851: Fix TX stall caused by TX buffer overrun") Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: netdev@vger.kernel.org Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240709195845.9089-1-rwahl@gmx.de Signed-off-by: Paolo Abeni <pabeni@redhat.com> 11 July 2024, 09:52:29 UTC
5c0b485 udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port(). syzkaller triggered the warning [0] in udp_v4_early_demux(). In udp_v[46]_early_demux() and sk_lookup(), we do not touch the refcount of the looked-up sk and use sock_pfree() as skb->destructor, so we check SOCK_RCU_FREE to ensure that the sk is safe to access during the RCU grace period. Currently, SOCK_RCU_FREE is flagged for a bound socket after being put into the hash table. Moreover, the SOCK_RCU_FREE check is done too early in udp_v[46]_early_demux() and sk_lookup(), so there could be a small race window: CPU1 CPU2 ---- ---- udp_v4_early_demux() udp_lib_get_port() | |- hlist_add_head_rcu() |- sk = __udp4_lib_demux_lookup() | |- DEBUG_NET_WARN_ON_ONCE(sk_is_refcounted(sk)); `- sock_set_flag(sk, SOCK_RCU_FREE) We had the same bug in TCP and fixed it in commit 871019b22d1b ("net: set SOCK_RCU_FREE before inserting socket into hashtable"). Let's apply the same fix for UDP. [0]: WARNING: CPU: 0 PID: 11198 at net/ipv4/udp.c:2599 udp_v4_early_demux+0x481/0xb70 net/ipv4/udp.c:2599 Modules linked in: CPU: 0 PID: 11198 Comm: syz-executor.1 Not tainted 6.9.0-g93bda33046e7 #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 RIP: 0010:udp_v4_early_demux+0x481/0xb70 net/ipv4/udp.c:2599 Code: c5 7a 15 fe bb 01 00 00 00 44 89 e9 31 ff d3 e3 81 e3 bf ef ff ff 89 de e8 2c 74 15 fe 85 db 0f 85 02 06 00 00 e8 9f 7a 15 fe <0f> 0b e8 98 7a 15 fe 49 8d 7e 60 e8 4f 39 2f fe 49 c7 46 60 20 52 RSP: 0018:ffffc9000ce3fa58 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff8318c92c RDX: ffff888036ccde00 RSI: ffffffff8318c2f1 RDI: 0000000000000001 RBP: ffff88805a2dd6e0 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: 0001ffffffffffff R12: ffff88805a2dd680 R13: 0000000000000007 R14: ffff88800923f900 R15: ffff88805456004e FS: 00007fc449127640(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 
0000 CR0: 0000000080050033 CR2: 00007fc449126e38 CR3: 000000003de4b002 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 PKRU: 55555554 Call Trace: <TASK> ip_rcv_finish_core.constprop.0+0xbdd/0xd20 net/ipv4/ip_input.c:349 ip_rcv_finish+0xda/0x150 net/ipv4/ip_input.c:447 NF_HOOK include/linux/netfilter.h:314 [inline] NF_HOOK include/linux/netfilter.h:308 [inline] ip_rcv+0x16c/0x180 net/ipv4/ip_input.c:569 __netif_receive_skb_one_core+0xb3/0xe0 net/core/dev.c:5624 __netif_receive_skb+0x21/0xd0 net/core/dev.c:5738 netif_receive_skb_internal net/core/dev.c:5824 [inline] netif_receive_skb+0x271/0x300 net/core/dev.c:5884 tun_rx_batched drivers/net/tun.c:1549 [inline] tun_get_user+0x24db/0x2c50 drivers/net/tun.c:2002 tun_chr_write_iter+0x107/0x1a0 drivers/net/tun.c:2048 new_sync_write fs/read_write.c:497 [inline] vfs_write+0x76f/0x8d0 fs/read_write.c:590 ksys_write+0xbf/0x190 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __x64_sys_write+0x41/0x50 fs/read_write.c:652 x64_sys_call+0xe66/0x1990 arch/x86/include/generated/asm/syscalls_64.h:2 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0x4b/0x110 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7fc44a68bc1f Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 e9 cf f5 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 3c d0 f5 ff 48 RSP: 002b:00007fc449126c90 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00000000004bc050 RCX: 00007fc44a68bc1f RDX: 0000000000000032 RSI: 00000000200000c0 RDI: 00000000000000c8 RBP: 00000000004bc050 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000032 R11: 0000000000000293 R12: 0000000000000000 R13: 000000000000000b R14: 00007fc44a5ec530 R15: 0000000000000000 </TASK> Fixes: 
6acc9b432e67 ("bpf: Add helper to retrieve socket in BPF") Reported-by: syzkaller <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240709191356.24010-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> 11 July 2024, 09:28:27 UTC
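The ordering that this commit enforces, marking the object before making it visible to lockless readers, can be sketched with C11 atomics. This is a userspace analogy under stated assumptions: `sketch_sock`, `sketch_slot`, `sketch_bind` and `sketch_lookup` are hypothetical names, and a single atomic pointer stands in for the UDP hash table.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_sock {
    bool rcu_free;  /* stands in for the SOCK_RCU_FREE flag */
};

/* One-entry "hash table" that readers consult without taking a lock. */
static _Atomic(struct sketch_sock *) sketch_slot;

static void sketch_bind(struct sketch_sock *sk)
{
    assert(sk != NULL);

    /* Set the flag first ... */
    sk->rcu_free = true;

    /* ... then publish. The release store orders the flag write before the
     * pointer becomes visible to any reader. */
    atomic_store_explicit(&sketch_slot, sk, memory_order_release);
}

static struct sketch_sock *sketch_lookup(void)
{
    /* The acquire load pairs with the release store in sketch_bind(). */
    return atomic_load_explicit(&sketch_slot, memory_order_acquire);
}
```

With the two statements in `sketch_bind()` swapped, a reader could find the socket and still observe the flag as unset, which is the race window drawn in the commit message.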
bd9f534 i2c: mark HostNotify target address as used I2C core handles the local target for receiving HostNotify alerts. There is no separate driver bound to that address. That means userspace can access it if desired, leading to further complications if controllers are not capable of reading their own local target. Bind the local target to the dummy driver so it will be marked as "handled by the kernel" if the HostNotify feature is used. That protects against userspace access and prevents other drivers binding to it. Fixes: 2a71593da34d ("i2c: smbus: add core function handling SMBus host-notify") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> 11 July 2024, 09:27:30 UTC
6dfe0ab i2c: testunit: correct Kconfig description The testunit has nothing to do with 'eeprom', remove that term. It was a copy&paste leftover. Fixes: a8335c64c5f0 ("i2c: add slave testunit driver") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> 11 July 2024, 09:27:22 UTC
cff3bd0 netfilter: nf_tables: prefer nft_chain_validate nft_chain_validate already performs loop detection because a cycle will result in a call stack overflow (ctx->level >= NFT_JUMP_STACK_SIZE). It also follows maps via ->validate callback in nft_lookup, so there appears no reason to iterate the maps again. nf_tables_check_loops() and all its helper functions can be removed. This improves ruleset load time significantly, from 23s down to 12s. This also fixes a crash bug. Old loop detection code can result in unbounded recursion: BUG: TASK stack guard page was hit at .... Oops: stack guard page: 0000 [#1] PREEMPT SMP KASAN CPU: 4 PID: 1539 Comm: nft Not tainted 6.10.0-rc5+ #1 [..] with a suitable ruleset during validation of register stores. I can't see any actual reason to attempt to check for this from nft_validate_register_store(), at this point the transaction is still in progress, so we don't have a full picture of the rule graph. For nf-next it might make sense to either remove it or make this depend on table->validate_state in case we could catch an error earlier (for improved error reporting to userspace). Fixes: 20a69341f2d0 ("netfilter: nf_tables: add netlink set API") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> 11 July 2024, 09:26:35 UTC
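The depth-limited walk that the message relies on for loop detection can be sketched as follows. This is a simplified model, not nft_chain_validate(): each hypothetical `sketch_chain` has at most one jump target, the limit of 16 and the -EMLINK return value are choices made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_JUMP_STACK_SIZE 16  /* assumed depth limit for this sketch */

/* Simplification: a chain jumps to at most one other chain. */
struct sketch_chain {
    const struct sketch_chain *jump;
};

/*
 * Follow jumps while counting the nesting level. A cycle can never finish,
 * so it is guaranteed to reach the limit and be rejected, which bounds the
 * recursion no matter how the ruleset is shaped.
 */
static int sketch_chain_validate(const struct sketch_chain *chain, int level)
{
    assert(chain != NULL);

    if (level >= SKETCH_JUMP_STACK_SIZE)
        return -EMLINK;

    if (chain->jump == NULL)
        return 0;

    return sketch_chain_validate(chain->jump, level + 1);
}
```

Because the depth check comes before any further descent, a separate loop-detection pass over the same graph adds cost without catching anything new in this model.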
631a4b3 netfilter: nfnetlink_queue: drop bogus WARN_ON Happens when rules get flushed/deleted while packet is out, so remove this WARN_ON. This WARN exists in one form or another since v4.14, no need to backport this to older releases, hence use a more recent fixes tag. Fixes: 3f8019688894 ("netfilter: move nf_reinject into nfnetlink_queue modules") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202407081453.11ac0f63-lkp@intel.com Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> 11 July 2024, 09:26:33 UTC
0830f97 MAINTAINERS: VIRTIO I2C loses a maintainer, gains a reviewer Conghui Chen left, welcome Jian as reviewer. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: "Chen, Jian Jun" <jian.jun.chen@intel.com> 11 July 2024, 09:19:46 UTC
c184cf9 ethtool: netlink: do not return SQI value if link is down Do not attach SQI value if link is down. "SQI values are only valid if link-up condition is present" per OpenAlliance specification of 100Base-T1 Interoperability Test suite [1]. The same rule would apply for other link types. [1] https://opensig.org/automotive-ethernet-specifications/# Fixes: 806602191592 ("ethtool: provide UAPI for PHY Signal Quality Index (SQI)") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Woojung Huh <woojung.huh@microchip.com> Link: https://patch.msgid.link/20240709061943.729381-1-o.rempel@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com> 11 July 2024, 09:19:07 UTC
f2aeb73 ppp: reject claimed-as-LCP but actually malformed packets Since 'ppp_async_encode()' assumes valid LCP packets (with code from 1 to 7 inclusive), add 'ppp_check_packet()' to ensure that LCP packet has an actual body beyond PPP_LCP header bytes, and reject claimed-as-LCP but actually malformed data otherwise. Reported-by: syzbot+ec0723ba9605678b14bf@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ec0723ba9605678b14bf Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Signed-off-by: Paolo Abeni <pabeni@redhat.com> 11 July 2024, 09:00:08 UTC
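A length check of the kind described here can be sketched in userspace C. The function and constants below are hypothetical and carry a `sketch_` prefix to make that clear; the sketch assumes a 2-byte protocol field followed by a 4-byte LCP header (code, identifier, 16-bit length), and the decision to reject frames shorter than the protocol field is a choice of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PPP_LCP    0xc021u  /* PPP protocol number for LCP */
#define SKETCH_PROTO_LEN  2u       /* bytes in the PPP protocol field */
#define SKETCH_LCP_HDRLEN 4u       /* code + identifier + 16-bit length */

/* Return true if the frame may be passed on, false if it must be rejected. */
static bool sketch_ppp_check_packet(const uint8_t *data, size_t count)
{
    assert(data != NULL || count == 0);

    /* Too short to even carry a protocol field. */
    if (count < SKETCH_PROTO_LEN)
        return false;

    /* The protocol field is big-endian on the wire. */
    uint16_t proto = (uint16_t)((data[0] << 8) | data[1]);

    /* Only frames that claim to be LCP get the extra length check. */
    if (proto != SKETCH_PPP_LCP)
        return true;

    return count >= SKETCH_PROTO_LEN + SKETCH_LCP_HDRLEN;
}
```

Rejecting the frame up front means later code that reads the LCP code byte never indexes past the end of the buffer.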
ca8e83a MAINTAINERS: delete entries for Thor Thayer The email address bounced. I couldn't find a newer one in recent git history. Delete the entries and let them fall back to subsystem defaults. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> 11 July 2024, 08:48:42 UTC
50bd5a0 selftests/bpf: Add timer lockup selftest Add a selftest that tries to trigger a situation where two timer callbacks are attempting to cancel each other's timer. By running them continuously, we hit a condition where both run in parallel and cancel each other. Without the fix in the previous patch, this would cause a lockup as hrtimer_cancel on either side will wait for forward progress from the callback. Ensure that this situation leads to an EDEADLK error. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240711052709.2148616-1-memxor@gmail.com 11 July 2024, 08:18:31 UTC
8c6790b net: ethernet: mtk-star-emac: set mac_managed_pm when probing The below commit introduced a warning message when phy state is not in the states: PHY_HALTED, PHY_READY, and PHY_UP. commit 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") mtk-star-emac doesn't need mdiobus suspend/resume. To fix the warning message during resume, indicate the phy resume/suspend is managed by the mac when probing. Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") Signed-off-by: Jian Hui Lee <jianhui.lee@canonical.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240708065210.4178980-1-jianhui.lee@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> 11 July 2024, 08:13:28 UTC
76a0a3f e1000e: fix force smbus during suspend flow Commit 861e8086029e ("e1000e: move force SMBUS from enable ulp function to avoid PHY loss issue") resolved a PHY access loss during suspend on Meteor Lake consumer platforms, but it affected corporate systems incorrectly. A better fix, working for both consumer and corporate systems, was proposed in commit bfd546a552e1 ("e1000e: move force SMBUS near the end of enable_ulp function"). However, it introduced a regression on older devices, such as [8086:15B8], [8086:15F9], [8086:15BE]. This patch aims to fix the secondary regression, by limiting the scope of the changes to Meteor Lake platforms only. Fixes: bfd546a552e1 ("e1000e: move force SMBUS near the end of enable_ulp function") Reported-by: Todd Brandt <todd.e.brandt@intel.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218940 Reported-by: Dieter Mummenschanz <dmummenschanz@web.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218936 Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240709203123.2103296-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 11 July 2024, 02:06:17 UTC
97a9063 tcp: avoid too many retransmit packets If a TCP socket is using TCP_USER_TIMEOUT, and the other peer retracted its window to zero, tcp_retransmit_timer() can retransmit a packet every two jiffies (2 ms for HZ=1000), for about 4 minutes after TCP_USER_TIMEOUT has 'expired'. The fix is to make sure tcp_rtx_probe0_timed_out() takes icsk->icsk_user_timeout into account. Before blamed commit, the socket would not timeout after icsk->icsk_user_timeout, but would use standard exponential backoff for the retransmits. Also worth noting that before commit e89688e3e978 ("net: tcp: fix unexcepted socket die when snd_wnd is 0"), the issue would last 2 minutes instead of 4. Fixes: b701a99e431d ("tcp: Add tcp_clamp_rto_to_user_timeout() helper to improve accuracy") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Jon Maxwell <jmaxwell37@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20240710001402.2758273-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 11 July 2024, 02:05:27 UTC
0c23734 Merge branch 'fixes-for-bpf-timer-lockup-and-uaf' Kumar Kartikeya Dwivedi says: ==================== Fixes for BPF timer lockup and UAF The following patches contain fixes for timer lockups and a use-after-free scenario. This set proposes to fix the following lockup situation for BPF timers. CPU 1 CPU 2 bpf_timer_cb bpf_timer_cb timer_cb1 timer_cb2 bpf_timer_cancel(timer_cb2) bpf_timer_cancel(timer_cb1) hrtimer_cancel hrtimer_cancel In this case, both callbacks will continue waiting for each other to finish synchronously, causing a lockup. The proposed fix adds support for tracking in-flight cancellations *begun by other timer callbacks* for a particular BPF timer. Whenever preparing to call hrtimer_cancel, a callback will increment the target timer's counter, then inspect its in-flight cancellations, and if non-zero, return -EDEADLK to avoid situations where the target timer's callback is waiting for its completion. This does mean that in cases where a callback is fired and cancelled, it will be unable to cancel any timers in that execution. This can be alleviated by maintaining the list of waiting callbacks in bpf_hrtimer and searching through it to avoid interdependencies, but this may introduce additional delays in bpf_timer_cancel, in addition to requiring extra state at runtime which may need to be allocated or reused from bpf_hrtimer storage. Moreover, extra synchronization is needed to delete these elements from the list of waiting callbacks once hrtimer_cancel has finished. The second patch is for a deadlock situation similar to above in bpf_timer_cancel_and_free, but also a UAF scenario that can occur if timer is armed before entering it, if hrtimer_running check causes the hrtimer_cancel call to be skipped. 
As seen above, synchronous hrtimer_cancel would lead to deadlock (if same callback tries to free its timer, or two timers free each other), therefore we queue work onto the global workqueue to ensure outstanding timers are cancelled before bpf_hrtimer state is freed. Further details are in the patches. ==================== Link: https://lore.kernel.org/r/20240709185440.1104957-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> 10 July 2024, 23:20:16 UTC
a6fcd19 bpf: Defer work in bpf_timer_cancel_and_free Currently, the same case as previous patch (two timer callbacks trying to cancel each other) can be invoked through bpf_map_update_elem as well, or more precisely, freeing map elements containing timers. Since this relies on hrtimer_cancel as well, it is prone to the same deadlock situation as the previous patch. It would be sufficient to use hrtimer_try_to_cancel to fix this problem, as the timer cannot be enqueued after async_cancel_and_free. Once async_cancel_and_free has been done, the timer must be reinitialized before it can be armed again. The callback running in parallel trying to arm the timer will fail, and freeing bpf_hrtimer without waiting is sufficient (given kfree_rcu), and bpf_timer_cb will return HRTIMER_NORESTART, preventing the timer from being rearmed again. However, there exists a UAF scenario where the callback arms the timer before entering this function, such that if cancellation fails (due to timer callback invoking this routine, or the target timer callback running concurrently). In such a case, if the timer expiration is significantly far in the future, the RCU grace period expiration happening before it will free the bpf_hrtimer state and along with it the struct hrtimer, that is enqueued. Hence, it is clear cancellation needs to occur after async_cancel_and_free, and yet it cannot be done inline due to deadlock issues. We thus modify bpf_timer_cancel_and_free to defer work to the global workqueue, adding a work_struct alongside rcu_head (both used at _different_ points of time, so can share space). Update existing code comments to reflect the new state of affairs. Fixes: b00628b1c7d5 ("bpf: Introduce bpf timers.") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20240709185440.1104957-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> 10 July 2024, 22:59:44 UTC
d452383 bpf: Fail bpf_timer_cancel when callback is being cancelled Given a schedule: timer1 cb timer2 cb bpf_timer_cancel(timer2); bpf_timer_cancel(timer1); Both bpf_timer_cancel calls would wait for the other callback to finish executing, introducing a lockup. Add an atomic_t count named 'cancelling' in bpf_hrtimer. This keeps track of all in-flight cancellation requests for a given BPF timer. Whenever cancelling a BPF timer, we must check if we have outstanding cancellation requests, and if so, we must fail the operation with an error (-EDEADLK) since cancellation is synchronous and waits for the callback to finish executing. This implies that we can enter a deadlock situation involving two or more timer callbacks executing in parallel and attempting to cancel one another. Note that we avoid incrementing the cancelling counter for the target timer (the one being cancelled) if bpf_timer_cancel is not invoked from a callback, to avoid spurious errors. The whole point of detecting cur->cancelling and returning -EDEADLK is to not enter a busy wait loop (which may or may not lead to a lockup). This does not apply in case the caller is in a non-callback context, the other side can continue to cancel as it sees fit without running into errors. Background on prior attempts: Earlier versions of this patch used a bool 'cancelling' bit and used the following pattern under timer->lock to publish cancellation status. lock(t->lock); t->cancelling = true; mb(); if (cur->cancelling) return -EDEADLK; unlock(t->lock); hrtimer_cancel(t->timer); t->cancelling = false; The store outside the critical section could overwrite a parallel request's t->cancelling assignment to true, which was made to ensure the concurrently executing callback observes its cancellation status. It would be necessary to clear this cancelling bit once hrtimer_cancel is done, but lack of serialization introduced races. 
Another option was explored where bpf_timer_start would clear the bit when (re)starting the timer under timer->lock. This would ensure serialized access to the cancelling bit, but may allow it to be cleared before in-flight hrtimer_cancel has finished executing, such that lockups can occur again. Thus, we choose an atomic counter to keep track of all outstanding cancellation requests and use it to prevent lockups in case callbacks attempt to cancel each other while executing in parallel. Reported-by: Dohyun Kim <dohyunkim@google.com> Reported-by: Neel Natu <neelnatu@google.com> Fixes: b00628b1c7d5 ("bpf: Introduce bpf timers.") Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20240709185440.1104957-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> 10 July 2024, 22:59:44 UTC
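The counter scheme described in this commit can be sketched in userspace with C11 atomics. This is an illustration under stated assumptions, not the bpf_timer_cancel() implementation: `sketch_timer` and `sketch_timer_cancel` are hypothetical, `cur` models "the timer whose callback is currently running on this CPU" (NULL outside a callback), and the blocking wait for the target callback is replaced by a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

struct sketch_timer {
    atomic_int cancelling;  /* in-flight cancellations targeting this timer */
};

/*
 * cur:    timer whose callback is calling us, or NULL in non-callback context
 * target: timer to cancel
 */
static int sketch_timer_cancel(struct sketch_timer *cur,
                               struct sketch_timer *target)
{
    int ret = 0;

    assert(target != NULL);

    if (cur) {
        /* Announce the cancellation before inspecting our own counter, so
         * that of two callbacks cancelling each other at least one sees it. */
        atomic_fetch_add(&target->cancelling, 1);

        /* Someone is already waiting for our callback to finish; waiting
         * for theirs in turn would never make progress. */
        if (atomic_load(&cur->cancelling) > 0)
            ret = -EDEADLK;
    }

    if (ret == 0) {
        /* A real implementation would block here until the target's
         * callback has finished (hrtimer_cancel in the kernel). */
    }

    if (cur)
        atomic_fetch_sub(&target->cancelling, 1);

    return ret;
}
```

A counter, unlike the boolean of the earlier attempts, lets several concurrent cancellers register and unregister without overwriting each other's state. The default sequentially consistent atomics stand in for the explicit barrier a kernel version would need.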
af253ae bpf: fix order of args in call to bpf_map_kvcalloc The original function call passed size of smap->bucket before the number of buckets which raises the error 'calloc-transposed-args' on compilation. Vlastimil Babka added: The order of parameters can be traced back all the way to 6ac99e8f23d4 ("bpf: Introduce bpf sk local storage") across several refactorings, and that's why the commit is used as a Fixes: tag. In v6.10-rc1, a different commit 2c321f3f70bc ("mm: change inlined allocation helpers to account at the call site") however exposed the order of args in a way that gcc-14 has enough visibility to start warning about it, because (in !CONFIG_MEMCG case) bpf_map_kvcalloc is then a macro alias for kvcalloc instead of a static inline wrapper. To sum up the warning happens when the following conditions are all met: - gcc-14 is used (didn't see it with gcc-13) - commit 2c321f3f70bc is present - CONFIG_MEMCG is not enabled in .config - CONFIG_WERROR turns this from a compiler warning to error Fixes: 6ac99e8f23d4 ("bpf: Introduce bpf sk local storage") Reviewed-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: Mohammad Shehar Yaar Tausif <sheharyaar48@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/r/20240710100521.15061-2-vbabka@suse.cz Signed-off-by: Alexei Starovoitov <ast@kernel.org> 10 July 2024, 22:31:19 UTC
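The argument order at issue follows the standard calloc() convention: element count first, element size second. The userspace sketch below shows that order with plain calloc(); `sketch_bucket` and `sketch_alloc_buckets` are hypothetical names and do not reflect the layout of the BPF local storage buckets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bucket type used only for this illustration. */
struct sketch_bucket {
    void *head;
    int   lock;
};

static struct sketch_bucket *sketch_alloc_buckets(size_t nbuckets)
{
    assert(nbuckets > 0);

    /* Count first, element size second. Swapping the two allocates the same
     * number of bytes, but newer GCC can flag it (calloc-transposed-args),
     * and with warnings treated as errors the build then fails. */
    return calloc(nbuckets, sizeof(struct sketch_bucket));
}
```

The returned array is zero-filled and must be released with free() by the caller.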
9d9a2f2 Merge tag 'mm-hotfixes-stable-2024-07-10-13-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "21 hotfixes, 15 of which are cc:stable. No identifiable theme here - all are singleton patches, 19 are for MM" * tag 'mm-hotfixes-stable-2024-07-10-13-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits) mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio mm/hugetlb: fix potential race in __update_and_free_hugetlb_folio() filemap: replace pte_offset_map() with pte_offset_map_nolock() arch/xtensa: always_inline get_current() and current_thread_info() sched.h: always_inline alloc_tag_{save|restore} to fix modpost warnings MAINTAINERS: mailmap: update Lorenzo Stoakes's email address mm: fix crashes from deferred split racing folio migration lib/build_OID_registry: avoid non-destructive substitution for Perl < 5.13.2 compat mm: gup: stop abusing try_grab_folio nilfs2: fix kernel bug on rename operation of broken directory mm/hugetlb_vmemmap: fix race with speculative PFN walkers cachestat: do not flush stats in recency check mm/shmem: disable PMD-sized page cache if needed mm/filemap: skip to create PMD-sized page cache if needed mm/readahead: limit page cache size in page_cache_ra_order() mm/filemap: make MAX_PAGECACHE_ORDER acceptable to xarray mm/damon/core: merge regions aggressively when max_nr_regions is unmet Fix userfaultfd_api to return EINVAL as expected mm: vmalloc: check if a hash-index is in cpu_possible_mask mm: prevent derefencing NULL ptr in pfn_section_valid() ... 10 July 2024, 21:59:41 UTC
ef2b7eb Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "One core change that moves a disk start message to a location where it will only be printed once instead of twice plus a couple of error handling race fixes in the ufs driver" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: Do not repeat the starting disk message scsi: ufs: core: Fix ufshcd_abort_one racing issue scsi: ufs: core: Fix ufshcd_clear_cmd racing issue 10 July 2024, 21:47:35 UTC
fea6b5e i2c: rcar: clear NO_RXDMA flag after resetting We should allow RXDMA only if the reset was really successful, so clear the flag after the reset call. Fixes: 0e864b552b23 ("i2c: rcar: reset controller is mandatory for Gen3+") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> 10 July 2024, 20:52:30 UTC
d6e1712 Merge tag 'vfio-v6.10' of https://github.com/awilliam/linux-vfio Pull VFIO fix from Alex Williamson: - Recent stable backports are exposing a bug introduced in the v6.10 development cycle where a counter value is uninitialized. This leads to regressions in userspace drivers like QEMU where the kernel might ask for an arbitrary buffer size or return out of memory itself based on a bogus value. Zero initialize the counter. (Yi Liu) * tag 'vfio-v6.10' of https://github.com/awilliam/linux-vfio: vfio/pci: Init the count variable in collecting hot-reset devices 10 July 2024, 19:00:43 UTC
f6963ab Merge tag 'bcachefs-2024-07-10' of https://evilpiepirate.org/git/bcachefs Pull bcachefs fixes from Kent Overstreet: - Switch some asserts to WARN() - Fix a few "transaction not locked" asserts in the data read retry paths and backpointers gc - Fix a race that would cause the journal to get stuck on a flush commit - Add missing fsck checks for the fragmentation LRU - The usual assorted syzbot fixes * tag 'bcachefs-2024-07-10' of https://evilpiepirate.org/git/bcachefs: (22 commits) bcachefs: Add missing bch2_trans_begin() bcachefs: Fix missing error check in journal_entry_btree_keys_validate() bcachefs: Warn on attempting a move with no replicas bcachefs: bch2_data_update_to_text() bcachefs: Log mount failure error code bcachefs: Fix undefined behaviour in eytzinger1_first() bcachefs: Mark bch_inode_info as SLAB_ACCOUNT bcachefs: Fix bch2_inode_insert() race path for tmpfiles closures: fix closure_sync + closure debugging bcachefs: Fix journal getting stuck on a flush commit bcachefs: io clock: run timer fns under clock lock bcachefs: Repair fragmentation_lru in alloc_write_key() bcachefs: add check for missing fragmentation in check_alloc_to_lru_ref() bcachefs: bch2_btree_write_buffer_maybe_flush() bcachefs: Add missing printbuf_tabstops_reset() calls bcachefs: Fix loop restart in bch2_btree_transactions_read() bcachefs: Fix bch2_read_retry_nodecode() bcachefs: Don't use the new_fs() bucket alloc path on an initialized fs bcachefs: Fix shift greater than integer size bcachefs: Change bch2_fs_journal_stop() BUG_ON() to warning ... 10 July 2024, 18:50:16 UTC
70c8e39 Merge tag 'usb-serial-6.10-rc8' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus Johan writes: USB-serial fixes for 6.10-rc8 Here's a fix for a long-standing issue in the mos7840 driver that can trigger a crash when resuming from system suspend. Included are also some new modem device ids. All have been in linux-next with no reported issues. * tag 'usb-serial-6.10-rc8' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: USB: serial: mos7840: fix crash on resume USB: serial: option: add Rolling RW350-GL variants USB: serial: option: add support for Foxconn T99W651 USB: serial: option: add Netprisma LCUK54 series modules 10 July 2024, 17:55:07 UTC
fd80d14 bcachefs: fix scheduling while atomic in break_cycle() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> 10 July 2024, 16:59:28 UTC
6f692b1 bcachefs: Fix RCU splat Reported-by: syzbot+e74fea078710bbca6f4b@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> 10 July 2024, 16:46:22 UTC
a19ea42 Merge tag 'platform-drivers-x86-v6.10-6' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fix from Hans de Goede: "One-liner fix for a dmi_system_id array in the toshiba_acpi driver not being terminated properly. Something which somehow has escaped detection since being introduced in 2022 until now" * tag 'platform-drivers-x86-v6.10-6' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: toshiba_acpi: Fix array out-of-bounds access 10 July 2024, 16:08:22 UTC
97488b9 Merge tag 'acpi-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Fix the sorting of _CST output data in the ACPI processor idle driver (Kuan-Wei Chiu)" * tag 'acpi-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: processor_idle: Fix invalid comparison with insertion sort for latency 10 July 2024, 16:05:22 UTC
130abfe Merge tag 'pm-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "Fix two issues related to boost frequencies handling, one in the cpufreq core and one in the ACPI cpufreq driver (Mario Limonciello)" * tag 'pm-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: ACPI: Mark boost policy as enabled when setting boost cpufreq: Allow drivers to advertise boost enabled 10 July 2024, 16:03:21 UTC
d045c46 Merge tag 'thermal-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fixes from Rafael Wysocki: "These fix a possible NULL pointer dereference in a thermal governor, fix up the handling of thermal zones enabled before their temperature can be determined and fix list sorting during thermal zone temperature updates. Specifics: - Prevent the Power Allocator thermal governor from dereferencing a NULL pointer if it is bound to a tripless thermal zone (Nícolas Prado) - Prevent thermal zones enabled too early from staying effectively dormant forever because their temperature cannot be determined initially (Rafael Wysocki) - Fix list sorting during thermal zone temperature updates to ensure the proper ordering of trip crossing notifications (Rafael Wysocki)" * tag 'thermal-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal: core: Fix list sorting in __thermal_zone_device_update() thermal: core: Call monitor_thermal_zone() if zone temperature is invalid thermal: gov_power_allocator: Return early in manage if trip_max is NULL 10 July 2024, 16:00:55 UTC
367cbaa Merge tag 'devicetree-fixes-for-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree fix from Rob Herring: - One fix for PASemi Nemo board interrupts * tag 'devicetree-fixes-for-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of/irq: Disable "interrupt-map" parsing for PASEMI Nemo 10 July 2024, 15:58:50 UTC
5a88a3f vfio/pci: Init the count variable in collecting hot-reset devices The count variable is used without initialization, it results in mistakes in the device counting and crashes the userspace if the get hot reset info path is triggered. Fixes: f6944d4a0b87 ("vfio/pci: Collect hot-reset devices to local buffer") Link: https://bugzilla.kernel.org/show_bug.cgi?id=219010 Reported-by: Žilvinas Žaltiena <zaltys@natrix.lt> Cc: Beld Zhang <beldzhang@gmail.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240710004150.319105-1-yi.l.liu@intel.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> 10 July 2024, 14:47:46 UTC
b6e02c6 platform/x86: toshiba_acpi: Fix array out-of-bounds access In order to use toshiba_dmi_quirks[] together with the standard DMI matching functions, it must be terminated by an empty entry. Since this entry is missing, an array out-of-bounds access occurs every time the quirk list is processed. Fix this by adding the terminating empty entry. Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202407091536.8b116b3d-lkp@intel.com Fixes: 3cb1f40dfdc3 ("drivers/platform: toshiba_acpi: Call HCI_PANEL_POWER_ON on resume on some models") Cc: stable@vger.kernel.org Signed-off-by: Armin Wolf <W_Armin@gmx.de> Link: https://lore.kernel.org/r/20240709143851.10097-1-W_Armin@gmx.de Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> 10 July 2024, 14:12:12 UTC
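The terminating-entry requirement described in b6e02c6 can be illustrated with a small user-space sketch. This is not the kernel code: `struct quirk_entry`, `quirks[]`, and `find_quirk()` are hypothetical stand-ins for `struct dmi_system_id`, `toshiba_dmi_quirks[]`, and the DMI matching functions. It only shows why a table walked until an empty entry must actually end with one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a quirk table entry (not struct dmi_system_id). */
struct quirk_entry {
    const char *ident;  /* NULL marks the end of the table */
    int flags;
};

/* The final all-zero entry is the terminator. Without it, the loop in
 * find_quirk() would keep reading past the end of the array. */
static const struct quirk_entry quirks[] = {
    { "model-a", 1 },
    { "model-b", 2 },
    { NULL, 0 }         /* terminating empty entry */
};

/* Walk the table until the terminator; the array length is never passed in,
 * so the terminator is the only thing that stops the scan. */
static const struct quirk_entry *find_quirk(const struct quirk_entry *table,
                                            const char *name)
{
    for (const struct quirk_entry *q = table; q->ident != NULL; q++) {
        if (strcmp(q->ident, name) == 0)
            return q;
    }
    return NULL;
}
```

Calling `find_quirk(quirks, "model-z")` returns NULL only because the scan reaches the empty entry. With that entry removed, the same lookup would read beyond the array, which is the class of out-of-bounds access the commit message describes.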
7d7f71c bcachefs: Add missing bch2_trans_begin() this fixes a 'transaction should be locked' error in backpointers fsck Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> 10 July 2024, 13:53:39 UTC
0f6f8f7 bcachefs: Fix missing error check in journal_entry_btree_keys_validate() Closes: https://syzkaller.appspot.com/bug?extid=8996d8f176cf946ef641 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> 10 July 2024, 13:53:39 UTC
f49d2c9 bcachefs: Warn on attempting a move with no replicas Instead of popping an assert in bch2_write(), WARN and print out some debugging info. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> 10 July 2024, 13:53:39 UTC
ad8b68c bcachefs: bch2_data_update_to_text() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev> 10 July 2024, 13:53:39 UTC