sort by:
Revision Author Date Message Commit Date
39eb028 cxgb4: fix wrong shift. While fixing coverity warning, commit dd2c79677375 introduced typo in shift value. Fix that. Signed-off-by: Pavel Machek (CIP) <pavel@denx.de> Fixes: dd2c79677375 ("cxgb4: Fix unintentional sign extension issues") Signed-off-by: David S. Miller <davem@davemloft.net> 18 June 2021, 18:56:33 UTC
1c200f8 net: qed: Fix memcpy() overflow of qed_dcbx_params() The source (&dcbx_info->operational.params) and dest (&p_hwfn->p_dcbx_info->set.config.params) are both struct qed_dcbx_params (560 bytes), not struct qed_dcbx_admin_params (564 bytes), which is used as the memcpy() size. However it seems that struct qed_dcbx_operational_params (dcbx_info->operational)'s layout matches struct qed_dcbx_admin_params (p_hwfn->p_dcbx_info->set.config)'s 4 byte difference (3 padding, 1 byte for "valid"). On the assumption that the size is wrong (rather than the source structure type), adjust the memcpy() size argument to be 4 bytes smaller and add a BUILD_BUG_ON() to validate any changes to the structure sizes. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net> 17 June 2021, 19:14:51 UTC
c3b26fd net: cdc_eem: fix tx fixup skb leak when usbnet transmit a skb, eem fixup it in eem_tx_fixup(), if skb_copy_expand() failed, it return NULL, usbnet_start_xmit() will have no chance to free original skb. fix it by free orginal skb in eem_tx_fixup() first, then check skb clone status, if failed, return NULL to usbnet. Fixes: 9f722c0978b0 ("usbnet: CDC EEM support (v5)") Signed-off-by: Linyu Yuan <linyyuan@codeaurora.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net> 17 June 2021, 18:30:25 UTC
bc39f67 Merge tag 'mlx5-fixes-2021-06-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2021-06-16 This series introduces some fixes to mlx5 driver. Please pull and let me know if there is any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 17 June 2021, 18:26:30 UTC
7edcc68 net: hamradio: fix memory leak in mkiss_close My local syzbot instance hit memory leak in mkiss_open()[1]. The problem was in missing free_netdev() in mkiss_close(). In mkiss_open() netdevice is allocated and then registered, but in mkiss_close() netdevice was only unregistered, but not freed. Fail log: BUG: memory leak unreferenced object 0xffff8880281ba000 (size 4096): comm "syz-executor.1", pid 11443, jiffies 4295046091 (age 17.660s) hex dump (first 32 bytes): 61 78 30 00 00 00 00 00 00 00 00 00 00 00 00 00 ax0............. 00 27 fa 2a 80 88 ff ff 00 00 00 00 00 00 00 00 .'.*............ backtrace: [<ffffffff81a27201>] kvmalloc_node+0x61/0xf0 [<ffffffff8706e7e8>] alloc_netdev_mqs+0x98/0xe80 [<ffffffff84e64192>] mkiss_open+0xb2/0x6f0 [1] [<ffffffff842355db>] tty_ldisc_open+0x9b/0x110 [<ffffffff84236488>] tty_set_ldisc+0x2e8/0x670 [<ffffffff8421f7f3>] tty_ioctl+0xda3/0x1440 [<ffffffff81c9f273>] __x64_sys_ioctl+0x193/0x200 [<ffffffff8911263a>] do_syscall_64+0x3a/0xb0 [<ffffffff89200068>] entry_SYSCALL_64_after_hwframe+0x44/0xae BUG: memory leak unreferenced object 0xffff8880141a9a00 (size 96): comm "syz-executor.1", pid 11443, jiffies 4295046091 (age 17.660s) hex dump (first 32 bytes): e8 a2 1b 28 80 88 ff ff e8 a2 1b 28 80 88 ff ff ...(.......(.... 98 92 9c aa b0 40 02 00 00 00 00 00 00 00 00 00 .....@.......... backtrace: [<ffffffff8709f68b>] __hw_addr_create_ex+0x5b/0x310 [<ffffffff8709fb38>] __hw_addr_add_ex+0x1f8/0x2b0 [<ffffffff870a0c7b>] dev_addr_init+0x10b/0x1f0 [<ffffffff8706e88b>] alloc_netdev_mqs+0x13b/0xe80 [<ffffffff84e64192>] mkiss_open+0xb2/0x6f0 [1] [<ffffffff842355db>] tty_ldisc_open+0x9b/0x110 [<ffffffff84236488>] tty_set_ldisc+0x2e8/0x670 [<ffffffff8421f7f3>] tty_ioctl+0xda3/0x1440 [<ffffffff81c9f273>] __x64_sys_ioctl+0x193/0x200 [<ffffffff8911263a>] do_syscall_64+0x3a/0xb0 [<ffffffff89200068>] entry_SYSCALL_64_after_hwframe+0x44/0xae BUG: memory leak unreferenced object 0xffff8880219bfc00 (size 512): comm "syz-executor.1", pid 11443, jiffies 4295046091 (age 17.660s) hex dump (first 32 bytes): 00 a0 1b 28 80 88 ff ff 80 8f b1 8d ff ff ff ff ...(............ 80 8f b1 8d ff ff ff ff 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81a27201>] kvmalloc_node+0x61/0xf0 [<ffffffff8706eec7>] alloc_netdev_mqs+0x777/0xe80 [<ffffffff84e64192>] mkiss_open+0xb2/0x6f0 [1] [<ffffffff842355db>] tty_ldisc_open+0x9b/0x110 [<ffffffff84236488>] tty_set_ldisc+0x2e8/0x670 [<ffffffff8421f7f3>] tty_ioctl+0xda3/0x1440 [<ffffffff81c9f273>] __x64_sys_ioctl+0x193/0x200 [<ffffffff8911263a>] do_syscall_64+0x3a/0xb0 [<ffffffff89200068>] entry_SYSCALL_64_after_hwframe+0x44/0xae BUG: memory leak unreferenced object 0xffff888029b2b200 (size 256): comm "syz-executor.1", pid 11443, jiffies 4295046091 (age 17.660s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81a27201>] kvmalloc_node+0x61/0xf0 [<ffffffff8706f062>] alloc_netdev_mqs+0x912/0xe80 [<ffffffff84e64192>] mkiss_open+0xb2/0x6f0 [1] [<ffffffff842355db>] tty_ldisc_open+0x9b/0x110 [<ffffffff84236488>] tty_set_ldisc+0x2e8/0x670 [<ffffffff8421f7f3>] tty_ioctl+0xda3/0x1440 [<ffffffff81c9f273>] __x64_sys_ioctl+0x193/0x200 [<ffffffff8911263a>] do_syscall_64+0x3a/0xb0 [<ffffffff89200068>] entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 815f62bf7427 ("[PATCH] SMP rewrite of mkiss") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> 17 June 2021, 18:24:54 UTC
c19c8c0 be2net: Fix an error handling path in 'be_probe()' If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it must be undone by a corresponding 'pci_disable_pcie_error_reporting()' call, as already done in the remove function. Fixes: d6b6d9877878 ("be2net: use PCIe AER capability") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> 17 June 2021, 18:24:06 UTC
0232fc2 net/mlx5: Reset mkey index on creation Reset only the index part of the mkey and keep the variant part. On devlink reload, driver recreates mkeys, so the mkey index may change. Trying to preserve the variant part of the mkey, driver mistakenly merged the mkey index with current value. In case of a devlink reload, current value of index part is dirty, so the index may be corrupted. Fixes: 54c62e13ad76 ("{IB,net}/mlx5: Setup mkey variant before mr create command invocation") Signed-off-by: Aya Levin <ayal@nvidia.com> Signed-off-by: Amir Tzin <amirtz@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 16 June 2021, 22:36:49 UTC
a5ae8fc net/mlx5e: Don't create devices during unload flow Running devlink reload command for port in switchdev mode cause resources to corrupt: driver can't release allocated EQ and reclaim memory pages, because "rdma" auxiliary device had add CQs which blocks EQ from deletion. Erroneous sequence happens during reload-down phase, and is following: 1. detach device - suspends auxiliary devices which support it, destroys others. During this step "eth-rep" and "rdma-rep" are destroyed, "eth" - suspended. 2. disable SRIOV - moves device to legacy mode; as part of disablement - rescans drivers. This step adds "rdma" auxiliary device. 3. destroy EQ table - <failure>. Driver shouldn't create any device during unload flows. To handle that implement MLX5_PRIV_FLAGS_DETACH flag, set it on device detach and unset on device attach. If flag is set do no-op on drivers rescan. Fixes: a925b5e309c9 ("net/mlx5: Register mlx5 devices to auxiliary virtual bus") Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 16 June 2021, 22:36:47 UTC
65fb7d1 net/mlx5: DR, Fix STEv1 incorrect L3 decapsulation padding Decapsulation L3 on small inner packets which are less than 64 Bytes was done incorrectly. In small packets there is an extra padding added in L2 which should not be included in L3 length. The issue was that after decapL3 the extra L2 padding caused an update on the L3 length. To avoid this issue the new header is pushed to the beginning of the packet (offset 0) which should not cause a HW reparse and update the L3 length. Fixes: c349b4137cfd ("net/mlx5: DR, Add STEv1 modify header logic") Reviewed-by: Erez Shitrit <erezsh@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 16 June 2021, 22:36:45 UTC
c7d6c19 net/mlx5: SF_DEV, remove SF device on invalid state When auxiliary bus autoprobe is disabled and SF is in ACTIVE state, on SF port deletion it transitions from ACTIVE->ALLOCATED->INVALID. When VHCA event handler queries the state, it is already transition to INVALID state. In this scenario, event handler missed to delete the SF device. Fix it by deleting the SF when SF state is INVALID. Fixes: 90d010b8634b ("net/mlx5: SF, Add auxiliary device support") Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Vu Pham <vuhuong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 16 June 2021, 22:36:42 UTC
ca36fc4 net/mlx5: E-Switch, Allow setting GUID for host PF vport E-switch should be able to set the GUID of host PF vport. Currently it returns an error. This results in below error when user attempts to configure MAC address of the PF of an external controller. $ devlink port function set pci/0000:03:00.0/196608 \ hw_addr 00:00:00:11:22:33 mlx5_core 0000:03:00.0: mlx5_esw_set_vport_mac_locked:1876:(pid 6715):\ "Failed to set vport 0 node guid, err = -22. RDMA_CM will not function properly for this VF." Check for zero vport is no longer needed. Fixes: 330077d14de1 ("net/mlx5: E-switch, Supporting setting devlink port function mac address") Signed-off-by: Yuval Avnery <yuvalav@nvidia.com> Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Bodong Wang <bodong@nvidia.com> Reviewed-by: Alaa Hleihel <alaa@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 16 June 2021, 22:36:40 UTC
bbc8222 net/mlx5: E-Switch, Read PF mac address External controller PF's MAC address is not read from the device during vport setup. Fail to read this results in showing all zeros to user while the factory programmed MAC is a valid value. $ devlink port show eth1 -jp { "port": { "pci/0000:03:00.0/196608": { "type": "eth", "netdev": "eth1", "flavour": "pcipf", "controller": 1, "pfnum": 0, "splittable": false, "function": { "hw_addr": "00:00:00:00:00:00" } } } } Hence, read it when enabling a vport. After the fix, $ devlink port show eth1 -jp { "port": { "pci/0000:03:00.0/196608": { "type": "eth", "netdev": "eth1", "flavour": "pcipf", "controller": 1, "pfnum": 0, "splittable": false, "function": { "hw_addr": "98:03:9b:a0:60:11" } } } } Fixes: f099fde16db3 ("net/mlx5: E-switch, Support querying port function mac address") Signed-off-by: Bodong Wang <bodong@nvidia.com> Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Alaa Hleihel <alaa@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 16 June 2021, 22:36:38 UTC
2058cc9 net/mlx5: Check that driver was probed prior attaching the device The device can be requested to be attached despite being not probed. This situation is possible if devlink reload races with module removal, and the following kernel panic is an outcome of such race. mlx5_core 0000:00:09.0: firmware version: 4.7.9999 mlx5_core 0000:00:09.0: 0.000 Gb/s available PCIe bandwidth (8.0 GT/s PCIe x255 link) BUG: unable to handle page fault for address: fffffffffffffff0 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 3218067 P4D 3218067 PUD 321a067 PMD 0 Oops: 0000 [#1] SMP KASAN NOPTI CPU: 7 PID: 250 Comm: devlink Not tainted 5.12.0-rc2+ #2836 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:mlx5_attach_device+0x80/0x280 [mlx5_core] Code: f8 48 c1 e8 03 42 80 3c 38 00 0f 85 80 01 00 00 48 8b 45 68 48 8d 78 f0 48 89 fe 48 c1 ee 03 42 80 3c 3e 00 0f 85 70 01 00 00 <48> 8b 40 f0 48 85 c0 74 0d 48 89 ef ff d0 85 c0 0f 85 84 05 0e 00 RSP: 0018:ffff8880129675f0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffffffff827407f1 RDX: 1ffff110011336cf RSI: 1ffffffffffffffe RDI: fffffffffffffff0 RBP: ffff888008e0c000 R08: 0000000000000008 R09: ffffffffa0662ee7 R10: fffffbfff40cc5dc R11: 0000000000000000 R12: ffff88800ea002e0 R13: ffffed1001d459f7 R14: ffffffffa05ef4f8 R15: dffffc0000000000 FS: 00007f51dfeaf740(0000) GS:ffff88806d5c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: fffffffffffffff0 CR3: 000000000bc82006 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: mlx5_load_one+0x117/0x1d0 [mlx5_core] devlink_reload+0x2d5/0x520 ? devlink_remote_reload_actions_performed+0x30/0x30 ? mutex_trylock+0x24b/0x2d0 ? devlink_nl_cmd_reload+0x62b/0x1070 devlink_nl_cmd_reload+0x66d/0x1070 ? devlink_reload+0x520/0x520 ? devlink_nl_pre_doit+0x64/0x4d0 genl_family_rcv_msg_doit+0x1e9/0x2f0 ? mutex_lock_io_nested+0x1130/0x1130 ? genl_family_rcv_msg_attrs_parse.constprop.0+0x240/0x240 ? security_capable+0x51/0x90 genl_rcv_msg+0x27f/0x4a0 ? genl_get_cmd+0x3c0/0x3c0 ? lock_acquire+0x1a9/0x6d0 ? devlink_reload+0x520/0x520 ? lock_release+0x6c0/0x6c0 netlink_rcv_skb+0x11d/0x340 ? genl_get_cmd+0x3c0/0x3c0 ? netlink_ack+0x9f0/0x9f0 ? lock_release+0x1f9/0x6c0 genl_rcv+0x24/0x40 netlink_unicast+0x433/0x700 ? netlink_attachskb+0x730/0x730 ? _copy_from_iter_full+0x178/0x650 ? __alloc_skb+0x113/0x2b0 netlink_sendmsg+0x6f1/0xbd0 ? netlink_unicast+0x700/0x700 ? netlink_unicast+0x700/0x700 sock_sendmsg+0xb0/0xe0 __sys_sendto+0x193/0x240 ? __x64_sys_getpeername+0xb0/0xb0 ? copy_page_range+0x2300/0x2300 ? __up_read+0x1a1/0x7b0 ? do_user_addr_fault+0x219/0xdc0 __x64_sys_sendto+0xdd/0x1b0 ? syscall_enter_from_user_mode+0x1d/0x50 do_syscall_64+0x2d/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f51dffb514a Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c3 0f 1f 44 00 00 55 48 83 ec 30 44 89 4c RSP: 002b:00007ffcaef22e78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f51dffb514a RDX: 0000000000000030 RSI: 000055750daf2440 RDI: 0000000000000003 RBP: 000055750daf2410 R08: 00007f51e0081200 R09: 000000000000000c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Modules linked in: mlx5_core(-) ptp pps_core ib_ipoib rdma_ucm rdma_cm iw_cm ib_cm ib_umad ib_uverbs ib_core [last unloaded: mlx5_ib] CR2: fffffffffffffff0 ---[ end trace 7789831bfe74fa42 ]--- Fixes: a925b5e309c9 ("net/mlx5: Register mlx5 devices to auxiliary virtual bus") Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 16 June 2021, 22:36:35 UTC
94a4b84 net/mlx5: Fix error path for set HCA defaults In the case of the failure to execute mlx5_core_set_hca_defaults(), we used wrong goto label to execute error unwind flow. Fixes: 5bef709d76a2 ("net/mlx5: Enable host PF HCA after eswitch is initialized") Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 16 June 2021, 22:36:32 UTC
da5ac77 r8169: Avoid memcpy() over-reading of ETH_SS_STATS In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally reading across neighboring array fields. The memcpy() is copying the entire structure, not just the first array. Adjust the source argument so the compiler can do appropriate bounds checking. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 20:02:07 UTC
224004f sh_eth: Avoid memcpy() over-reading of ETH_SS_STATS In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally reading across neighboring array fields. The memcpy() is copying the entire structure, not just the first array. Adjust the source argument so the compiler can do appropriate bounds checking. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 20:02:07 UTC
99718ab r8152: Avoid memcpy() over-reading of ETH_SS_STATS In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally reading across neighboring array fields. The memcpy() is copying the entire structure, not just the first array. Adjust the source argument so the compiler can do appropriate bounds checking. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 20:02:06 UTC
1b29df0 selftests: net: use bash to run udpgro_fwd test case udpgro_fwd.sh contains many bash specific operators ("[[", "local -r"), but it's using /bin/sh; in some distro /bin/sh is mapped to /bin/dash, that doesn't support such operators. Force the test to use /bin/bash explicitly and prevent false positive test failures. Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 19:56:10 UTC
a494bd6 net/af_unix: fix a data-race in unix_dgram_sendmsg / unix_release_sock While unix_may_send(sk, osk) is called while osk is locked, it appears unix_release_sock() can overwrite unix_peer() after this lock has been released, making KCSAN unhappy. Changing unix_release_sock() to access/change unix_peer() before lock is released should fix this issue. BUG: KCSAN: data-race in unix_dgram_sendmsg / unix_release_sock write to 0xffff88810465a338 of 8 bytes by task 20852 on cpu 1: unix_release_sock+0x4ed/0x6e0 net/unix/af_unix.c:558 unix_release+0x2f/0x50 net/unix/af_unix.c:859 __sock_release net/socket.c:599 [inline] sock_close+0x6c/0x150 net/socket.c:1258 __fput+0x25b/0x4e0 fs/file_table.c:280 ____fput+0x11/0x20 fs/file_table.c:313 task_work_run+0xae/0x130 kernel/task_work.c:164 tracehook_notify_resume include/linux/tracehook.h:189 [inline] exit_to_user_mode_loop kernel/entry/common.c:175 [inline] exit_to_user_mode_prepare+0x156/0x190 kernel/entry/common.c:209 __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline] syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:302 do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff88810465a338 of 8 bytes by task 20888 on cpu 0: unix_may_send net/unix/af_unix.c:189 [inline] unix_dgram_sendmsg+0x923/0x1610 net/unix/af_unix.c:1712 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg net/socket.c:674 [inline] ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350 ___sys_sendmsg net/socket.c:2404 [inline] __sys_sendmmsg+0x315/0x4b0 net/socket.c:2490 __do_sys_sendmmsg net/socket.c:2519 [inline] __se_sys_sendmmsg net/socket.c:2516 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2516 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0xffff888167905400 -> 0x0000000000000000 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 20888 Comm: syz-executor.0 Not tainted 5.13.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 19:51:55 UTC
0fd158b selftests: net: veth: make test compatible with dash veth.sh is a shell script that uses /bin/sh; some distro (Ubuntu for example) use dash as /bin/sh and in this case the test reports the following error: # ./veth.sh: 21: local: -r: bad variable name # ./veth.sh: 21: local: -r: bad variable name This happens because dash doesn't support the option "-r" with local. Moreover, in case of missing bpf object, the script is exiting -1, that is an illegal number for dash: exit: Illegal number: -1 Change the script to be compatible both with bash and dash and prevent the errors above. Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 19:50:24 UTC
1d2ac20 Merge branch 'net-packet-data-races' Eric Dumazet says: ==================== net/packet: annotate data races KCSAN sent two reports about data races in af_packet. Nothing serious, but worth fixing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 19:48:18 UTC
e032f7c net/packet: annotate accesses to po->ifindex Like prior patch, we need to annotate lockless accesses to po->ifindex For instance, packet_getname() is reading po->ifindex (twice) while another thread is able to change po->ifindex. KCSAN reported: BUG: KCSAN: data-race in packet_do_bind / packet_getname write to 0xffff888143ce3cbc of 4 bytes by task 25573 on cpu 1: packet_do_bind+0x420/0x7e0 net/packet/af_packet.c:3191 packet_bind+0xc3/0xd0 net/packet/af_packet.c:3255 __sys_bind+0x200/0x290 net/socket.c:1637 __do_sys_bind net/socket.c:1648 [inline] __se_sys_bind net/socket.c:1646 [inline] __x64_sys_bind+0x3d/0x50 net/socket.c:1646 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff888143ce3cbc of 4 bytes by task 25578 on cpu 0: packet_getname+0x5b/0x1a0 net/packet/af_packet.c:3525 __sys_getsockname+0x10e/0x1a0 net/socket.c:1887 __do_sys_getsockname net/socket.c:1902 [inline] __se_sys_getsockname net/socket.c:1899 [inline] __x64_sys_getsockname+0x3e/0x50 net/socket.c:1899 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x00000000 -> 0x00000001 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 25578 Comm: syz-executor.5 Not tainted 5.13.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 19:48:18 UTC
c7d2ef5 net/packet: annotate accesses to po->bind tpacket_snd(), packet_snd(), packet_getname() and packet_seq_show() can read po->num without holding a lock. This means other threads can change po->num at the same time. KCSAN complained about this known fact [1] Add READ_ONCE()/WRITE_ONCE() to address the issue. [1] BUG: KCSAN: data-race in packet_do_bind / packet_sendmsg write to 0xffff888131a0dcc0 of 2 bytes by task 24714 on cpu 0: packet_do_bind+0x3ab/0x7e0 net/packet/af_packet.c:3181 packet_bind+0xc3/0xd0 net/packet/af_packet.c:3255 __sys_bind+0x200/0x290 net/socket.c:1637 __do_sys_bind net/socket.c:1648 [inline] __se_sys_bind net/socket.c:1646 [inline] __x64_sys_bind+0x3d/0x50 net/socket.c:1646 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff888131a0dcc0 of 2 bytes by task 24719 on cpu 1: packet_snd net/packet/af_packet.c:2899 [inline] packet_sendmsg+0x317/0x3570 net/packet/af_packet.c:3040 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg net/socket.c:674 [inline] ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350 ___sys_sendmsg net/socket.c:2404 [inline] __sys_sendmsg+0x1ed/0x270 net/socket.c:2433 __do_sys_sendmsg net/socket.c:2442 [inline] __se_sys_sendmsg net/socket.c:2440 [inline] __x64_sys_sendmsg+0x42/0x50 net/socket.c:2440 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x0000 -> 0x1200 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 24719 Comm: syz-executor.5 Not tainted 5.13.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 19:48:18 UTC
e82a35a Merge tag 'linux-can-fixes-for-5.13-20210616' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2021-06-16 this is a pull request of 4 patches for net/master. The first patch is by Oleksij Rempel and fixes a Use-after-Free found by syzbot in the j1939 stack. The next patch is by Tetsuo Handa and fixes hung task detected by syzbot in the bcm, raw and isotp protocols. Norbert Slusarek's patch fixes a infoleak in bcm's struct bcm_msg_head. Pavel Skripkin's patch fixes a memory leak in the mcba_usb driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 19:44:11 UTC
d8e2973 net: ipv4: fix memory leak in ip_mc_add1_src BUG: memory leak unreferenced object 0xffff888101bc4c00 (size 32): comm "syz-executor527", pid 360, jiffies 4294807421 (age 19.329s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 01 00 00 00 00 00 00 00 ac 14 14 bb 00 00 02 00 ................ backtrace: [<00000000f17c5244>] kmalloc include/linux/slab.h:558 [inline] [<00000000f17c5244>] kzalloc include/linux/slab.h:688 [inline] [<00000000f17c5244>] ip_mc_add1_src net/ipv4/igmp.c:1971 [inline] [<00000000f17c5244>] ip_mc_add_src+0x95f/0xdb0 net/ipv4/igmp.c:2095 [<000000001cb99709>] ip_mc_source+0x84c/0xea0 net/ipv4/igmp.c:2416 [<0000000052cf19ed>] do_ip_setsockopt net/ipv4/ip_sockglue.c:1294 [inline] [<0000000052cf19ed>] ip_setsockopt+0x114b/0x30c0 net/ipv4/ip_sockglue.c:1423 [<00000000477edfbc>] raw_setsockopt+0x13d/0x170 net/ipv4/raw.c:857 [<00000000e75ca9bb>] __sys_setsockopt+0x158/0x270 net/socket.c:2117 [<00000000bdb993a8>] __do_sys_setsockopt net/socket.c:2128 [inline] [<00000000bdb993a8>] __se_sys_setsockopt net/socket.c:2125 [inline] [<00000000bdb993a8>] __x64_sys_setsockopt+0xba/0x150 net/socket.c:2125 [<000000006a1ffdbd>] do_syscall_64+0x40/0x80 arch/x86/entry/common.c:47 [<00000000b11467c4>] entry_SYSCALL_64_after_hwframe+0x44/0xae In commit 24803f38a5c0 ("igmp: do not remove igmp souce list info when set link down"), the ip_mc_clear_src() in ip_mc_destroy_dev() was removed, because it was also called in igmpv3_clear_delrec(). Rough callgraph: inetdev_destroy -> ip_mc_destroy_dev -> igmpv3_clear_delrec -> ip_mc_clear_src -> RCU_INIT_POINTER(dev->ip_ptr, NULL) However, ip_mc_clear_src() called in igmpv3_clear_delrec() doesn't release in_dev->mc_list->sources. And RCU_INIT_POINTER() assigns the NULL to dev->ip_ptr. As a result, in_dev cannot be obtained through inetdev_by_index() and then in_dev->mc_list->sources cannot be released by ip_mc_del1_src() in the sock_close. Rough call sequence goes like: sock_close -> __sock_release -> inet_release -> ip_mc_drop_socket -> inetdev_by_index -> ip_mc_leave_src -> ip_mc_del_src -> ip_mc_del1_src So we still need to call ip_mc_clear_src() in ip_mc_destroy_dev() to free in_dev->mc_list->sources. Fixes: 24803f38a5c0 ("igmp: do not remove igmp souce list info ...") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Chengyang Fan <cy.fan@huawei.com> Acked-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 19:41:01 UTC
c0d982b Merge branch 'fec-ptp-fixes' Joakim Zhang says: ==================== net: fixes for fec ptp Small fixes for fec ptp. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 19:39:21 UTC
d237656 net: fec_ptp: fix issue caused by refactor the fec_devtype Commit da722186f654 ("net: fec: set GPR bit on suspend by DT configuration.") refactor the fec_devtype, need adjust ptp driver accordingly. Fixes: da722186f654 ("net: fec: set GPR bit on suspend by DT configuration.") Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 19:39:03 UTC
cb3cefe net: fec_ptp: add clock rate zero check Add clock rate zero check to fix coverity issue of "divide by 0". Fixes: commit 85bd1798b24a ("net: fec: fix spin_lock dead lock") Signed-off-by: Fugang Duan <fugang.duan@nxp.com> Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 19:39:03 UTC
56b786d net: usb: fix possible use-after-free in smsc75xx_bind The commit 46a8b29c6306 ("net: usb: fix memory leak in smsc75xx_bind") fails to clean up the work scheduled in smsc75xx_reset-> smsc75xx_set_multicast, which leads to use-after-free if the work is scheduled to start after the deallocation. In addition, this patch also removes a dangling pointer - dev->data[0]. This patch calls cancel_work_sync to cancel the scheduled work and set the dangling pointer to NULL. Fixes: 46a8b29c6306 ("net: usb: fix memory leak in smsc75xx_bind") Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 19:36:09 UTC
8f26910 net: stmmac: disable clocks in stmmac_remove_config_dt() Platform drivers may call stmmac_probe_config_dt() to parse dt, could call stmmac_remove_config_dt() in error handing after dt parsed, so need disable clocks in stmmac_remove_config_dt(). Go through all platforms drivers which use stmmac_probe_config_dt(), none of them disable clocks manually, so it's safe to disable them in stmmac_remove_config_dt(). Fixes: commit d2ed0a7755fe ("net: ethernet: stmmac: fix of-node and fixed-link-phydev leaks") Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> 16 June 2021, 19:20:58 UTC
91c0255 can: mcba_usb: fix memory leak in mcba_usb Syzbot reported memory leak in SocketCAN driver for Microchip CAN BUS Analyzer Tool. The problem was in unfreed usb_coherent. In mcba_usb_start() 20 coherent buffers are allocated and there is nothing, that frees them: 1) In callback function the urb is resubmitted and that's all 2) In disconnect function urbs are simply killed, but URB_FREE_BUFFER is not set (see mcba_usb_start) and this flag cannot be used with coherent buffers. Fail log: | [ 1354.053291][ T8413] mcba_usb 1-1:0.0 can0: device disconnected | [ 1367.059384][ T8420] kmemleak: 20 new suspected memory leaks (see /sys/kernel/debug/kmem) So, all allocated buffers should be freed with usb_free_coherent() explicitly NOTE: The same pattern for allocating and freeing coherent buffers is used in drivers/net/can/usb/kvaser_usb/kvaser_usb_core.c Fixes: 51f3baad7de9 ("can: mcba_usb: Add support for Microchip CAN BUS Analyzer") Link: https://lore.kernel.org/r/20210609215833.30393-1-paskripkin@gmail.com Cc: linux-stable <stable@vger.kernel.org> Reported-and-tested-by: syzbot+57281c762a3922e14dfe@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> 16 June 2021, 10:52:18 UTC
5e87ddb can: bcm: fix infoleak in struct bcm_msg_head On 64-bit systems, struct bcm_msg_head has an added padding of 4 bytes between struct members count and ival1. Even though all struct members are initialized, the 4-byte hole will contain data from the kernel stack. This patch zeroes out struct bcm_msg_head before usage, preventing infoleaks to userspace. Fixes: ffd980f976e7 ("[CAN]: Add broadcast manager (bcm) protocol") Link: https://lore.kernel.org/r/trinity-7c1b2e82-e34f-4885-8060-2cd7a13769ce-1623532166177@3c-app-gmx-bs52 Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Norbert Slusarek <nslusarek@gmx.net> Acked-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> 16 June 2021, 10:52:18 UTC
8d0caed can: bcm/raw/isotp: use per module netdevice notifier syzbot is reporting hung task at register_netdevice_notifier() [1] and unregister_netdevice_notifier() [2], for cleanup_net() might perform time consuming operations while CAN driver's raw/bcm/isotp modules are calling {register,unregister}_netdevice_notifier() on each socket. Change raw/bcm/isotp modules to call register_netdevice_notifier() from module's __init function and call unregister_netdevice_notifier() from module's __exit function, as with gw/j1939 modules are doing. Link: https://syzkaller.appspot.com/bug?id=391b9498827788b3cc6830226d4ff5be87107c30 [1] Link: https://syzkaller.appspot.com/bug?id=1724d278c83ca6e6df100a2e320c10d991cf2bce [2] Link: https://lore.kernel.org/r/54a5f451-05ed-f977-8534-79e7aa2bcc8f@i-love.sakura.ne.jp Cc: linux-stable <stable@vger.kernel.org> Reported-by: syzbot <syzbot+355f8edb2ff45d5f95fa@syzkaller.appspotmail.com> Reported-by: syzbot <syzbot+0f1827363a305f74996f@syzkaller.appspotmail.com> Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com> Tested-by: syzbot <syzbot+355f8edb2ff45d5f95fa@syzkaller.appspotmail.com> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> 16 June 2021, 10:52:18 UTC
2030043 can: j1939: fix Use-after-Free, hold skb ref while in use This patch fixes a Use-after-Free found by the syzbot. The problem is that a skb is taken from the per-session skb queue, without incrementing the ref count. This leads to a Use-after-Free if the skb is taken concurrently from the session queue due to a CTS. Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Link: https://lore.kernel.org/r/20210521115720.7533-1-o.rempel@pengutronix.de Cc: Hillf Danton <hdanton@sina.com> Cc: linux-stable <stable@vger.kernel.org> Reported-by: syzbot+220c1a29987a9a490903@syzkaller.appspotmail.com Reported-by: syzbot+45199c1b73b4013525cf@syzkaller.appspotmail.com Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> 16 June 2021, 10:52:18 UTC
a4f0377 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2021-06-15 The following pull-request contains BPF updates for your *net* tree. We've added 5 non-merge commits during the last 11 day(s) which contain a total of 10 files changed, 115 insertions(+), 16 deletions(-). The main changes are: 1) Fix marking incorrect umem ring as done in libbpf's xsk_socket__create_shared() helper, from Kev Jackson. 2) Fix oob leakage under a spectre v1 type confusion attack, from Daniel Borkmann. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 15 June 2021, 22:26:07 UTC
7ea6cd1 lantiq: net: fix duplicated skb in rx descriptor ring The previous commit didn't fix the bug properly. By mistake, it replaces the pointer of the next skb in the descriptor ring instead of the current one. As a result, the two descriptors are assigned the same SKB. The error is seen during the iperf test when skb_put tries to insert a second packet and exceeds the available buffer. Fixes: c7718ee96dbc ("net: lantiq: fix memory corruption in RX ring ") Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: David S. Miller <davem@davemloft.net> 15 June 2021, 21:17:19 UTC
057d493 qmi_wwan: Do not call netif_rx from rx_fixup When the QMI_WWAN_FLAG_PASS_THROUGH is set, netif_rx() is called from qmi_wwan_rx_fixup(). When the call to netif_rx() is successful (which is most of the time), usbnet_skb_return() is called (from rx_process()). usbnet_skb_return() will then call netif_rx() a second time for the same skb. Simplify the code and avoid the redundant netif_rx() call by changing qmi_wwan_rx_fixup() to always return 1 when QMI_WWAN_FLAG_PASS_THROUGH is set. We then leave it up to the existing infrastructure to call netif_rx(). Suggested-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> 15 June 2021, 18:29:28 UTC
c1a3d40 net: cdc_ncm: switch to eth%d interface naming This is meant to make the host side cdc_ncm interface consistently named just like the older CDC protocols: cdc_ether & cdc_ecm (and even rndis_host), which all use 'FLAG_ETHER | FLAG_POINTTOPOINT'. include/linux/usb/usbnet.h: #define FLAG_ETHER 0x0020 /* maybe use "eth%d" names */ #define FLAG_WLAN 0x0080 /* use "wlan%d" names */ #define FLAG_WWAN 0x0400 /* use "wwan%d" names */ #define FLAG_POINTTOPOINT 0x1000 /* possibly use "usb%d" names */ drivers/net/usb/usbnet.c @ line 1711: strcpy (net->name, "usb%d"); ... // heuristic: "usb%d" for links we know are two-host, // else "eth%d" when there's reasonable doubt. userspace // can rename the link if it knows better. if ((dev->driver_info->flags & FLAG_ETHER) != 0 && ((dev->driver_info->flags & FLAG_POINTTOPOINT) == 0 || (net->dev_addr [0] & 0x02) == 0)) strcpy (net->name, "eth%d"); /* WLAN devices should always be named "wlan%d" */ if ((dev->driver_info->flags & FLAG_WLAN) != 0) strcpy(net->name, "wlan%d"); /* WWAN devices should always be named "wwan%d" */ if ((dev->driver_info->flags & FLAG_WWAN) != 0) strcpy(net->name, "wwan%d"); So by using ETHER | POINTTOPOINT the interface naming is either usb%d or eth%d based on the global uniqueness of the mac address of the device. Without this 2.5gbps ethernet dongles which all seem to use the cdc_ncm driver end up being called usb%d instead of eth%d even though they're definitely not two-host. (All 1gbps & 5gbps ethernet usb dongles I've tested don't hit this problem due to use of different drivers, primarily r8152 and aqc111) Fixes tag is based purely on git blame, and is really just here to make sure this hits LTS branches newer than v4.5. Cc: Lorenzo Colitti <lorenzo@google.com> Fixes: 4d06dd537f95 ("cdc_ncm: do not call usbnet_link_change from cdc_ncm_bind") Signed-off-by: Maciej Żenczykowski <maze@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> 15 June 2021, 18:24:57 UTC
e34492d net: inline function get_net_ns_by_fd if NET_NS is disabled The function get_net_ns_by_fd() could be inlined when NET_NS is not enabled. Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> 15 June 2021, 18:00:45 UTC
475b92f ptp: improve max_adj check against unreasonable values Scaled PPM conversion to PPB may (on 64bit systems) result in a value larger than s32 can hold (freq/scaled_ppm is a long). This means the kernel will not correctly reject unreasonably high ->freq values (e.g. > 4294967295ppb, 281474976645 scaled PPM). The conversion is equivalent to a division by ~66 (65.536), so the value of ppb is always smaller than ppm, but not small enough to assume narrowing the type from long -> s32 is okay. Note that reasonable user space (e.g. ptp4l) will not use such high values, anyway, 4289046510ppb ~= 4.3x, so the fix is somewhat pedantic. Fixes: d39a743511cd ("ptp: validate the requested frequency adjustment.") Fixes: d94ba80ebbea ("ptp: Added a brand new class driver for ptp clocks.") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> 15 June 2021, 17:59:46 UTC
2214fb5 net: mhi_net: Update the transmit handler prototype Update the function prototype of mhi_ndo_xmit to match ndo_start_xmit. This otherwise leads to run time failures when CFI is enabled in kernel. Fixes: 3ffec6a14f24 ("net: Add mhi-net driver") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net> 14 June 2021, 21:13:09 UTC
973377f bpf, selftests: Adjust few selftest outcomes wrt unreachable code In almost all cases from test_verifier that have been changed in here, we've had an unreachable path with a load from a register which has an invalid address on purpose. This was basically to make sure that we never walk this path and to have the verifier complain if it would otherwise. Change it to match on the right error for unprivileged given we now test these paths under speculative execution. There's one case where we match on exact # of insns_processed. Due to the extra path, this will of course mismatch on unprivileged. Thus, restrict the test->insn_processed check to privileged-only. In one other case, we result in a 'pointer comparison prohibited' error. This is similarly due to verifying an 'invalid' branch where we end up with a value pointer on one side of the comparison. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> 14 June 2021, 21:06:38 UTC
9183671 bpf: Fix leakage under speculation on mispredicted branches The verifier only enumerates valid control-flow paths and skips paths that are unreachable in the non-speculative domain. And so it can miss issues under speculative execution on mispredicted branches. For example, a type confusion has been demonstrated with the following crafted program: // r0 = pointer to a map array entry // r6 = pointer to readable stack slot // r9 = scalar controlled by attacker 1: r0 = *(u64 *)(r0) // cache miss 2: if r0 != 0x0 goto line 4 3: r6 = r9 4: if r0 != 0x1 goto line 6 5: r9 = *(u8 *)(r6) 6: // leak r9 Since line 3 runs iff r0 == 0 and line 5 runs iff r0 == 1, the verifier concludes that the pointer dereference on line 5 is safe. But: if the attacker trains both the branches to fall-through, such that the following is speculatively executed ... r6 = r9 r9 = *(u8 *)(r6) // leak r9 ... then the program will dereference an attacker-controlled value and could leak its content under speculative execution via side-channel. This requires to mistrain the branch predictor, which can be rather tricky, because the branches are mutually exclusive. However such training can be done at congruent addresses in user space using different branches that are not mutually exclusive. That is, by training branches in user space ... A: if r0 != 0x0 goto line C B: ... C: if r0 != 0x0 goto line D D: ... ... such that addresses A and C collide to the same CPU branch prediction entries in the PHT (pattern history table) as those of the BPF program's lines 2 and 4, respectively. A non-privileged attacker could simply brute force such collisions in the PHT until observing the attack succeeding. Alternative methods to mistrain the branch predictor are also possible that avoid brute forcing the collisions in the PHT. A reliable attack has been demonstrated, for example, using the following crafted program: // r0 = pointer to a [control] map array entry // r7 = *(u64 *)(r0 + 0), training/attack phase // r8 = *(u64 *)(r0 + 8), oob address // [...] // r0 = pointer to a [data] map array entry 1: if r7 == 0x3 goto line 3 2: r8 = r0 // crafted sequence of conditional jumps to separate the conditional // branch in line 193 from the current execution flow 3: if r0 != 0x0 goto line 5 4: if r0 == 0x0 goto exit 5: if r0 != 0x0 goto line 7 6: if r0 == 0x0 goto exit [...] 187: if r0 != 0x0 goto line 189 188: if r0 == 0x0 goto exit // load any slowly-loaded value (due to cache miss in phase 3) ... 189: r3 = *(u64 *)(r0 + 0x1200) // ... and turn it into known zero for verifier, while preserving slowly- // loaded dependency when executing: 190: r3 &= 1 191: r3 &= 2 // speculatively bypassed phase dependency 192: r7 += r3 193: if r7 == 0x3 goto exit 194: r4 = *(u8 *)(r8 + 0) // leak r4 As can be seen, in training phase (phase != 0x3), the condition in line 1 turns into false and therefore r8 with the oob address is overridden with the valid map value address, which in line 194 we can read out without issues. However, in attack phase, line 2 is skipped, and due to the cache miss in line 189 where the map value is (zeroed and later) added to the phase register, the condition in line 193 takes the fall-through path due to prior branch predictor training, where under speculation, it'll load the byte at oob address r8 (unknown scalar type at that point) which could then be leaked via side-channel. One way to mitigate these is to 'branch off' an unreachable path, meaning, the current verification path keeps following the is_branch_taken() path and we push the other branch to the verification stack. Given this is unreachable from the non-speculative domain, this branch's vstate is explicitly marked as speculative. This is needed for two reasons: i) if this path is solely seen from speculative execution, then we later on still want the dead code elimination to kick in in order to sanitize these instructions with jmp-1s, and ii) to ensure that paths walked in the non-speculative domain are not pruned from earlier walks of paths walked in the speculative domain. Additionally, for robustness, we mark the registers which have been part of the conditional as unknown in the speculative path given there should be no assumptions made on their content. The fix in here mitigates type confusion attacks described earlier due to i) all code paths in the BPF program being explored and ii) existing verifier logic already ensuring that given memory access instruction references one specific data structure. An alternative to this fix that has also been looked at in this scope was to mark aux->alu_state at the jump instruction with a BPF_JMP_TAKEN state as well as direction encoding (always-goto, always-fallthrough, unknown), such that mixing of different always-* directions themselves as well as mixing of always-* with unknown directions would cause a program rejection by the verifier, e.g. programs with constructs like 'if ([...]) { x = 0; } else { x = 1; }' with subsequent 'if (x == 1) { [...] }'. For unprivileged, this would result in only single direction always-* taken paths, and unknown taken paths being allowed, such that the former could be patched from a conditional jump to an unconditional jump (ja). Compared to this approach here, it would have two downsides: i) valid programs that otherwise are not performing any pointer arithmetic, etc, would potentially be rejected/broken, and ii) we are required to turn off path pruning for unprivileged, where both can be avoided in this work through pushing the invalid branch to the verification stack. The issue was originally discovered by Adam and Ofek, and later independently discovered and reported as a result of Benedict and Piotr's research work. Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation") Reported-by: Adam Morrison <mad@cs.tau.ac.il> Reported-by: Ofek Kirzner <ofekkir@gmail.com> Reported-by: Benedict Schlueter <benedict.schlueter@rub.de> Reported-by: Piotr Krysiuk <piotras@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de> Reviewed-by: Piotr Krysiuk <piotras@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> 14 June 2021, 21:06:10 UTC
fe9a5ca bpf: Do not mark insn as seen under speculative path verification ... in such circumstances, we do not want to mark the instruction as seen given the goal is still to jmp-1 rewrite/sanitize dead code, if it is not reachable from the non-speculative path verification. We do however want to verify it for safety regardless. With the patch as-is all the insns that have been marked as seen before the patch will also be marked as seen after the patch (just with a potentially different non-zero count). An upcoming patch will also verify paths that are unreachable in the non-speculative domain, hence this extension is needed. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de> Reviewed-by: Piotr Krysiuk <piotras@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> 14 June 2021, 21:06:06 UTC
d203b0f bpf: Inherit expanded/patched seen count from old aux data Instead of relying on current env->pass_cnt, use the seen count from the old aux data in adjust_insn_aux_data(), and expand it to the new range of patched instructions. This change is valid given we always expand 1:n with n>=1, so what applies to the old/original instruction needs to apply for the replacement as well. Not relying on env->pass_cnt is a prerequisite for a later change where we want to avoid marking an instruction seen when verified under speculative execution path. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Reviewed-by: Benedict Schlueter <benedict.schlueter@rub.de> Reviewed-by: Piotr Krysiuk <piotras@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> 14 June 2021, 21:06:00 UTC
45deacc Merge tag 'for-net-2021-06-14' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fix crash on SMP when debug is enabled ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 14 June 2021, 21:00:57 UTC
995fca1 Bluetooth: SMP: Fix crash when receiving new connection when debug is enabled When receiving a new connection pchan->conn won't be initialized so the code cannot use bt_dev_dbg as the pointer to hci_dev won't be accessible. Fixes: 2e1614f7d61e4 ("Bluetooth: SMP: Convert BT_ERR/BT_DBG to bt_dev_err/bt_dev_dbg") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> 14 June 2021, 20:16:27 UTC
ad9d24c net: qrtr: fix OOB Read in qrtr_endpoint_post Syzbot reported slab-out-of-bounds Read in qrtr_endpoint_post. The problem was in wrong _size_ type: if (len != ALIGN(size, 4) + hdrlen) goto err; If size from qrtr_hdr is 4294967293 (0xfffffffd), the result of ALIGN(size, 4) will be 0. In case of len == hdrlen and size == 4294967293 in header this check won't fail and skb_put_data(skb, data + hdrlen, size); will read out of bound from data, which is hdrlen allocated block. Fixes: 194ccc88297a ("net: qrtr: Support decoding incoming v2 packets") Reported-and-tested-by: syzbot+1917d778024161609247@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> 14 June 2021, 20:01:26 UTC
b87b04f ipv4: Fix device used for dst_alloc with local routes Oliver reported a use case where deleting a VRF device can hang waiting for the refcnt to drop to 0. The root cause is that the dst is allocated against the VRF device but cached on the loopback device. The use case (added to the selftests) has an implicit VRF crossing due to the ordering of the FIB rules (lookup local is before the l3mdev rule, but the problem occurs even if the FIB rules are re-ordered with local after l3mdev because the VRF table does not have a default route to terminate the lookup). The end result is is that the FIB lookup returns the loopback device as the nexthop, but the ingress device is in a VRF. The mismatch causes the dst alloc against the VRF device but then cached on the loopback. The fix is to bring the trick used for IPv6 (see ip6_rt_get_dev_rcu): pick the dst alloc device based the fib lookup result but with checks that the result has a nexthop device (e.g., not an unreachable or prohibit entry). Fixes: f5a0aab84b74 ("net: ipv4: dst for local input routes should use l3mdev if relevant") Reported-by: Oliver Herms <oliver.peter.herms@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> 14 June 2021, 19:30:53 UTC
58af3d3 net: caif: fix memory leak in ldisc_open Syzbot reported memory leak in tty_init_dev(). The problem was in unputted tty in ldisc_open() static int ldisc_open(struct tty_struct *tty) { ... ser->tty = tty_kref_get(tty); ... result = register_netdevice(dev); if (result) { rtnl_unlock(); free_netdev(dev); return -ENODEV; } ... } Ser pointer is netdev private_data, so after free_netdev() this pointer goes away with unputted tty reference. So, fix it by adding tty_kref_put() before freeing netdev. Reported-and-tested-by: syzbot+f303e045423e617d2cad@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> 14 June 2021, 19:28:16 UTC
09427c1 cxgb4: fix wrong ethtool n-tuple rule lookup The TID returned during successful filter creation is relative to the region in which the filter is created. Using it directly always returns Hi Prio/Normal filter region's entry for the first couple of entries, even though the rule is actually inserted in Hash region. Fix by analyzing in which region the filter has been inserted and save the absolute TID to be used for lookup later. Fixes: db43b30cd89c ("cxgb4: add ethtool n-tuple filter deletion") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> 14 June 2021, 19:17:57 UTC
49a10c7 netxen_nic: Fix an error handling path in 'netxen_nic_probe()' If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it must be undone by a corresponding 'pci_disable_pcie_error_reporting()' call, as already done in the remove function. Fixes: e87ad5539343 ("netxen: support pci error handlers") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net> 14 June 2021, 19:17:01 UTC
cb33766 qlcnic: Fix an error handling path in 'qlcnic_probe()' If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it must be undone by a corresponding 'pci_disable_pcie_error_reporting()' call, as already done in the remove function. Fixes: 451724c821c1 ("qlcnic: aer support") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net> 14 June 2021, 19:16:07 UTC
e175aef ethtool: strset: fix message length calculation Outer nest for ETHTOOL_A_STRSET_STRINGSETS is not accounted for. This may result in ETHTOOL_MSG_STRSET_GET producing a warning like: calculated message payload length (684) not sufficient WARNING: CPU: 0 PID: 30967 at net/ethtool/netlink.c:369 ethnl_default_doit+0x87a/0xa20 and a splat. As usually with such warnings three conditions must be met for the warning to trigger: - there must be no skb size rounding up (e.g. reply_size of 684); - string set must be per-device (so that the header gets populated); - the device name must be at least 12 characters long. all in all with current user space it looks like reading priv flags is the only place this could potentially happen. Or with syzbot :) Reported-by: syzbot+59aa77b92d06cd5a54f2@syzkaller.appspotmail.com Fixes: 71921690f974 ("ethtool: provide string sets with STRSET_GET request") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> 14 June 2021, 19:14:24 UTC
994c393 net: qualcomm: rmnet: don't over-count statistics The purpose of the loop using u64_stats_fetch_*_irq() is to ensure statistics on a given CPU are collected atomically. If one of the statistics values gets updated within the begin/retry window, the loop will run again. Currently the statistics totals are updated inside that window. This means that if the loop ever retries, the statistics for the CPU will be counted more than once. Fix this by taking a snapshot of a CPU's statistics inside the protected window, and then updating the counters with the snapshot values after exiting the loop. (Also add a newline at the end of this file...) Fixes: 192c4b5d48f2a ("net: qualcomm: rmnet: Add support for 64 bit stats") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> 14 June 2021, 19:13:38 UTC
4f667b8 sch_cake: revise docs for RFC 8622 LE PHB support Commit b8392808eb3fc28e ("sch_cake: add RFC 8622 LE PHB support to CAKE diffserv handling") added the LE mark to the Bulk tin. Update the comments to reflect the change. Signed-off-by: Tyson Moore <tyson@tyson.me> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net> 14 June 2021, 19:10:57 UTC
ea6932d net: make get_net_ns return error if NET_NS is disabled There is a panic in socket ioctl cmd SIOCGSKNS when NET_NS is not enabled. The reason is that nsfs tries to access ns->ops but the proc_ns_operations is not implemented in this case. [7.670023] Unable to handle kernel NULL pointer dereference at virtual address 00000010 [7.670268] pgd = 32b54000 [7.670544] [00000010] *pgd=00000000 [7.671861] Internal error: Oops: 5 [#1] SMP ARM [7.672315] Modules linked in: [7.672918] CPU: 0 PID: 1 Comm: systemd Not tainted 5.13.0-rc3-00375-g6799d4f2da49 #16 [7.673309] Hardware name: Generic DT based system [7.673642] PC is at nsfs_evict+0x24/0x30 [7.674486] LR is at clear_inode+0x20/0x9c The same to tun SIOCGSKNS command. To fix this problem, we make get_net_ns() return -EINVAL when NET_NS is disabled. Meanwhile move it to right place net/core/net_namespace.c. Signed-off-by: Changbin Du <changbin.du@gmail.com> Fixes: c62cce2caee5 ("net: add an ioctl to get a socket network namespace") Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Christian Brauner <christian.brauner@ubuntu.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net> 12 June 2021, 20:13:08 UTC
1adb20f net: stmmac: dwmac1000: Fix extended MAC address registers definition The register starts from 0x800 is the 16th MAC address register rather than the first one. Fixes: cffb13f4d6fb ("stmmac: extend mac addr reg and fix perfect filering") Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: David S. Miller <davem@davemloft.net> 11 June 2021, 20:05:55 UTC
f4cdcae Merge branch 'cxgb4-fixes' Rahul Lakkireddy says: ==================== cxgb4: bug fixes for ethtool flash ops This series of patches add bug fixes in ethtool flash operations. Patch 1 fixes an endianness issue when writing boot image to flash after the device ID has been updated. Patch 2 fixes sleep in atomic when writing PHY firmware to flash. Patch 3 fixes issue with PHY firmware image not getting written to flash when chip is still running. -==================== Signed-off-by: David S. Miller <davem@davemloft.net> 11 June 2021, 18:15:01 UTC
6d29754 cxgb4: halt chip before flashing PHY firmware image When using firmware-assisted PHY firmware image write to flash, halt the chip before beginning the flash write operation to allow the running firmware to store the image persistently. Otherwise, the running firmware will only store the PHY image in local on-chip RAM, which will be lost after next reset. Fixes: 4ee339e1e92a ("cxgb4: add support to flash PHY image") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> 11 June 2021, 18:15:00 UTC
f046bd0 cxgb4: fix sleep in atomic when flashing PHY firmware Before writing new PHY firmware to on-chip memory, driver queries firmware for current running PHY firmware version, which can result in sleep waiting for reply. So, move spinlock closer to the actual on-chip memory write operation, instead of taking it at the callers. Fixes: 5fff701c838e ("cxgb4: always sync access when flashing PHY firmware") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> 11 June 2021, 18:15:00 UTC
42a2039 cxgb4: fix endianness when flashing boot image Boot images are copied to memory and updated with current underlying device ID before flashing them to adapter. Ensure the updated images are always flashed in Big Endian to allow the firmware to read the new images during boot properly. Fixes: 550883558f17 ("cxgb4: add support to flash boot image") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> 11 June 2021, 18:15:00 UTC
33e3814 alx: Fix an error handling path in 'alx_probe()' If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it must be undone by a corresponding 'pci_disable_pcie_error_reporting()' call, as already done in the remove function. Fixes: ab69bde6b2e9 ("alx: add a simple AR816x/AR817x device driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net> 11 June 2021, 18:12:54 UTC
da9ef50 net: phy: dp83867: perform soft reset and retain established link Current logic is performing hard reset and causing the programmed registers to be wiped out. as per datasheet: https://www.ti.com/lit/ds/symlink/dp83867cr.pdf 8.6.26 Control Register (CTRL) do SW_RESTART to perform a reset not including the registers, If performed when link is already present, it will drop the link and trigger re-auto negotiation. Signed-off-by: Praneeth Bajjuri <praneeth@ti.com> Signed-off-by: Geet Modi <geet.modi@ti.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> 11 June 2021, 17:13:03 UTC
232e368 Merge branch 'mptcp-fixes' Mat Martineau says: ==================== mptcp: More v5.13 fixes Here's another batch of MPTCP fixes for v5.13. Patch 1 cleans up memory accounting between the MPTCP-level socket and the subflows to more reliably transfer forward allocated memory under pressure. Patch 2 wakes up socket readers more reliably. Patch 3 changes a WARN_ONCE to a pr_debug. Patch 4 changes the selftests to only use syncookies in test cases where they do not cause spurious failures. Patch 5 modifies socket error reporting to avoid a possible soft lockup. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 23:47:45 UTC
499ada5 mptcp: fix soft lookup in subflow_error_report() Maxim reported a soft lookup in subflow_error_report(): watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0] RIP: 0010:native_queued_spin_lock_slowpath RSP: 0018:ffffa859c0003bc0 EFLAGS: 00000202 RAX: 0000000000000101 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff9195c2772d88 RSI: 0000000000000000 RDI: ffff9195c2772d88 RBP: ffff9195c2772d00 R08: 00000000000067b0 R09: c6e31da9eb1e44f4 R10: ffff9195ef379700 R11: ffff9195edb50710 R12: ffff9195c2772d88 R13: ffff9195f500e3d0 R14: ffff9195ef379700 R15: ffff9195ef379700 FS: 0000000000000000(0000) GS:ffff91961f400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c000407000 CR3: 0000000002988000 CR4: 00000000000006f0 Call Trace: <IRQ> _raw_spin_lock_bh subflow_error_report mptcp_subflow_data_available __mptcp_move_skbs_from_subflow mptcp_data_ready tcp_data_queue tcp_rcv_established tcp_v4_do_rcv tcp_v4_rcv ip_protocol_deliver_rcu ip_local_deliver_finish __netif_receive_skb_one_core netif_receive_skb rtl8139_poll 8139too __napi_poll net_rx_action __do_softirq __irq_exit_rcu common_interrupt </IRQ> The calling function - mptcp_subflow_data_available() - can be invoked from different contexts: - plain ssk socket lock - ssk socket lock + mptcp_data_lock - ssk socket lock + mptcp_data_lock + msk socket lock. Since subflow_error_report() tries to acquire the mptcp_data_lock, the latter two call chains will cause soft lookup. This change addresses the issue moving the error reporting call to outer functions, where the held locks list is known and the we can acquire only the needed one. Reported-by: Maxim Galaganov <max@internet.ru> Fixes: 15cc10453398 ("mptcp: deliver ssk errors to msk") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/199 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 23:47:45 UTC
2395da0 selftests: mptcp: enable syncookie only in absence of reorders Syncookie validation may fail for OoO packets, causing spurious resets and self-tests failures, so let's force syncookie only for tests iteration with no OoO. Fixes: fed61c4b584c ("selftests: mptcp: make 2nd net namespace use tcp syn cookies unconditionally") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/198 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 23:47:45 UTC
61e7102 mptcp: do not warn on bad input from the network warn_bad_map() produces a kernel WARN on bad input coming from the network. Use pr_debug() to avoid spamming the system log. Additionally, when the right bound check fails, warn_bad_map() reports the wrong ssn value, let's fix it. Fixes: 648ef4b88673 ("mptcp: Implement MPTCP receive path") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/107 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 23:47:45 UTC
99d1055 mptcp: wake-up readers only for in sequence data Currently we rely on the subflow->data_avail field, which is subject to races: ssk1 skb len = 500 DSS(seq=1, len=1000, off=0) # data_avail == MPTCP_SUBFLOW_DATA_AVAIL ssk2 skb len = 500 DSS(seq = 501, len=1000) # data_avail == MPTCP_SUBFLOW_DATA_AVAIL ssk1 skb len = 500 DSS(seq = 1, len=1000, off =500) # still data_avail == MPTCP_SUBFLOW_DATA_AVAIL, # as the skb is covered by a pre-existing map, # which was in-sequence at reception time. Instead we can explicitly check if some has been received in-sequence, propagating the info from __mptcp_move_skbs_from_subflow(). Additionally add the 'ONCE' annotation to the 'data_avail' memory access, as msk will read it outside the subflow socket lock. Fixes: 648ef4b88673 ("mptcp: Implement MPTCP receive path") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 23:47:44 UTC
72f9613 mptcp: try harder to borrow memory from subflow under pressure If the host is under sever memory pressure, and RX forward memory allocation for the msk fails, we try to borrow the required memory from the ingress subflow. The current attempt is a bit flaky: if skb->truesize is less than SK_MEM_QUANTUM, the ssk will not release any memory, and the next schedule will fail again. Instead, directly move the required amount of pages from the ssk to the msk, if available Fixes: 9c3f94e1681b ("mptcp: add missing memory scheduling in the rx path") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 23:47:44 UTC
22488e4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Fix a crash when stateful expression with its own gc callback is used in a set definition. 2) Skip IPv6 packets from any link-local address in IPv6 fib expression. Add a selftest for this scenario, from Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 21:33:56 UTC
0280f42 Merge branch 'tcp-options-oob-fixes' Maxim Mikityanskiy says: ==================== Fix out of bounds when parsing TCP options This series fixes out-of-bounds access in various places in the kernel where parsing of TCP options takes place. Fortunately, many more occurrences don't have this bug. v2 changes: synproxy: Added an early return when length < 0 to avoid calling skb_header_pointer with negative length. sch_cake: Added doff validation to avoid parsing garbage. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 21:26:18 UTC
ba91c49 sch_cake: Fix out of bounds when parsing TCP options and header The TCP option parser in cake qdisc (cake_get_tcpopt and cake_tcph_may_drop) could read one byte out of bounds. When the length is 1, the execution flow gets into the loop, reads one byte of the opcode, and if the opcode is neither TCPOPT_EOL nor TCPOPT_NOP, it reads one more byte, which exceeds the length of 1. This fix is inspired by commit 9609dad263f8 ("ipv4: tcp_input: fix stack out of bounds when parsing TCP options."). v2 changes: Added doff validation in cake_get_tcphdr to avoid parsing garbage as TCP header. Although it wasn't strictly an out-of-bounds access (memory was allocated), garbage values could be read where CAKE expected the TCP header if doff was smaller than 5. Cc: Young Xiao <92siuyang@gmail.com> Fixes: 8b7138814f29 ("sch_cake: Add optional ACK filter") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 21:26:18 UTC
07718be mptcp: Fix out of bounds when parsing TCP options The TCP option parser in mptcp (mptcp_get_options) could read one byte out of bounds. When the length is 1, the execution flow gets into the loop, reads one byte of the opcode, and if the opcode is neither TCPOPT_EOL nor TCPOPT_NOP, it reads one more byte, which exceeds the length of 1. This fix is inspired by commit 9609dad263f8 ("ipv4: tcp_input: fix stack out of bounds when parsing TCP options."). Cc: Young Xiao <92siuyang@gmail.com> Fixes: cec37a6e41aa ("mptcp: Handle MP_CAPABLE options for outgoing connections") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 21:26:18 UTC
5fc177a netfilter: synproxy: Fix out of bounds when parsing TCP options The TCP option parser in synproxy (synproxy_parse_options) could read one byte out of bounds. When the length is 1, the execution flow gets into the loop, reads one byte of the opcode, and if the opcode is neither TCPOPT_EOL nor TCPOPT_NOP, it reads one more byte, which exceeds the length of 1. This fix is inspired by commit 9609dad263f8 ("ipv4: tcp_input: fix stack out of bounds when parsing TCP options."). v2 changes: Added an early return when length < 0 to avoid calling skb_header_pointer with negative length. Cc: Young Xiao <92siuyang@gmail.com> Fixes: 48b1de4c110a ("netfilter: add SYNPROXY core/target") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 21:26:18 UTC
d1b5bee net/packet: annotate data race in packet_sendmsg() There is a known race in packet_sendmsg(), addressed in commit 32d3182cd2cd ("net/packet: fix race in tpacket_snd()") Now we have data_race(), we can use it to avoid a future KCSAN warning, as syzbot loves stressing af_packet sockets :) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 21:12:54 UTC
b71eaed inet: annotate date races around sk->sk_txhash UDP sendmsg() path can be lockless, it is possible for another thread to re-connect an change sk->sk_txhash under us. There is no serious impact, but we can use READ_ONCE()/WRITE_ONCE() pair to document the race. BUG: KCSAN: data-race in __ip4_datagram_connect / skb_set_owner_w write to 0xffff88813397920c of 4 bytes by task 30997 on cpu 1: sk_set_txhash include/net/sock.h:1937 [inline] __ip4_datagram_connect+0x69e/0x710 net/ipv4/datagram.c:75 __ip6_datagram_connect+0x551/0x840 net/ipv6/datagram.c:189 ip6_datagram_connect+0x2a/0x40 net/ipv6/datagram.c:272 inet_dgram_connect+0xfd/0x180 net/ipv4/af_inet.c:580 __sys_connect_file net/socket.c:1837 [inline] __sys_connect+0x245/0x280 net/socket.c:1854 __do_sys_connect net/socket.c:1864 [inline] __se_sys_connect net/socket.c:1861 [inline] __x64_sys_connect+0x3d/0x50 net/socket.c:1861 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff88813397920c of 4 bytes by task 31039 on cpu 0: skb_set_hash_from_sk include/net/sock.h:2211 [inline] skb_set_owner_w+0x118/0x220 net/core/sock.c:2101 sock_alloc_send_pskb+0x452/0x4e0 net/core/sock.c:2359 sock_alloc_send_skb+0x2d/0x40 net/core/sock.c:2373 __ip6_append_data+0x1743/0x21a0 net/ipv6/ip6_output.c:1621 ip6_make_skb+0x258/0x420 net/ipv6/ip6_output.c:1983 udpv6_sendmsg+0x160a/0x16b0 net/ipv6/udp.c:1527 inet6_sendmsg+0x5f/0x80 net/ipv6/af_inet6.c:642 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg net/socket.c:674 [inline] ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350 ___sys_sendmsg net/socket.c:2404 [inline] __sys_sendmmsg+0x315/0x4b0 net/socket.c:2490 __do_sys_sendmmsg net/socket.c:2519 [inline] __se_sys_sendmmsg net/socket.c:2516 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2516 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0xbca3c43d -> 0xfdb309e0 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 31039 Comm: syz-executor.2 Not tainted 5.13.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 21:12:54 UTC
f13ef10 net: annotate data race in sock_error() sock_error() is known to be racy. The code avoids an atomic operation is sk_err is zero, and this field could be changed under us, this is fine. Sysbot reported: BUG: KCSAN: data-race in sock_alloc_send_pskb / unix_release_sock write to 0xffff888131855630 of 4 bytes by task 9365 on cpu 1: unix_release_sock+0x2e9/0x6e0 net/unix/af_unix.c:550 unix_release+0x2f/0x50 net/unix/af_unix.c:859 __sock_release net/socket.c:599 [inline] sock_close+0x6c/0x150 net/socket.c:1258 __fput+0x25b/0x4e0 fs/file_table.c:280 ____fput+0x11/0x20 fs/file_table.c:313 task_work_run+0xae/0x130 kernel/task_work.c:164 tracehook_notify_resume include/linux/tracehook.h:189 [inline] exit_to_user_mode_loop kernel/entry/common.c:174 [inline] exit_to_user_mode_prepare+0x156/0x190 kernel/entry/common.c:208 __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline] syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:301 do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff888131855630 of 4 bytes by task 9385 on cpu 0: sock_error include/net/sock.h:2269 [inline] sock_alloc_send_pskb+0xe4/0x4e0 net/core/sock.c:2336 unix_dgram_sendmsg+0x478/0x1610 net/unix/af_unix.c:1671 unix_seqpacket_sendmsg+0xc2/0x100 net/unix/af_unix.c:2055 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg net/socket.c:674 [inline] ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350 __sys_sendmsg_sock+0x25/0x30 net/socket.c:2416 io_sendmsg fs/io_uring.c:4367 [inline] io_issue_sqe+0x231a/0x6750 fs/io_uring.c:6135 __io_queue_sqe+0xe9/0x360 fs/io_uring.c:6414 __io_req_task_submit fs/io_uring.c:2039 [inline] io_async_task_func+0x312/0x590 fs/io_uring.c:5074 __tctx_task_work fs/io_uring.c:1910 [inline] tctx_task_work+0x1d4/0x3d0 fs/io_uring.c:1924 task_work_run+0xae/0x130 kernel/task_work.c:164 tracehook_notify_signal include/linux/tracehook.h:212 [inline] handle_signal_work kernel/entry/common.c:145 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0xf8/0x190 kernel/entry/common.c:208 __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline] syscall_exit_to_user_mode+0x20/0x40 kernel/entry/common.c:301 do_syscall_64+0x56/0x90 arch/x86/entry/common.c:57 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x00000000 -> 0x00000068 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 9385 Comm: syz-executor.3 Not tainted 5.13.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 21:12:54 UTC
172947a Merge branch 'bridge-egress-fixes' Nikolay Aleksandrov says: ==================== net: bridge: vlan tunnel egress path fixes These two fixes take care of tunnel_dst problems in the vlan tunnel egress path. Patch 01 fixes a null ptr deref due to the lockless use of tunnel_dst pointer without checking it first, and patch 02 fixes a use-after-free issue due to wrong dst refcounting (dst_clone() -> dst_hold_safe()). Both fix the same commit and should be queued for stable backports: Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths") v2: no changes, added stable list to CC ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 21:06:43 UTC
cfc579f net: bridge: fix vlan tunnel dst refcnt when egressing The egress tunnel code uses dst_clone() and directly sets the result which is wrong because the entry might have 0 refcnt or be already deleted, causing number of problems. It also triggers the WARN_ON() in dst_hold()[1] when a refcnt couldn't be taken. Fix it by using dst_hold_safe() and checking if a reference was actually taken before setting the dst. [1] dmesg WARN_ON log and following refcnt errors WARNING: CPU: 5 PID: 38 at include/net/dst.h:230 br_handle_egress_vlan_tunnel+0x10b/0x134 [bridge] Modules linked in: 8021q garp mrp bridge stp llc bonding ipv6 virtio_net CPU: 5 PID: 38 Comm: ksoftirqd/5 Kdump: loaded Tainted: G W 5.13.0-rc3+ #360 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 RIP: 0010:br_handle_egress_vlan_tunnel+0x10b/0x134 [bridge] Code: e8 85 bc 01 e1 45 84 f6 74 90 45 31 f6 85 db 48 c7 c7 a0 02 19 a0 41 0f 94 c6 31 c9 31 d2 44 89 f6 e8 64 bc 01 e1 85 db 75 02 <0f> 0b 31 c9 31 d2 44 89 f6 48 c7 c7 70 02 19 a0 e8 4b bc 01 e1 49 RSP: 0018:ffff8881003d39e8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffffa01902a0 RBP: ffff8881040c6700 R08: 0000000000000000 R09: 0000000000000001 R10: 2ce93d0054fe0d00 R11: 54fe0d00000e0000 R12: ffff888109515000 R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000401 FS: 0000000000000000(0000) GS:ffff88822bf40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f42ba70f030 CR3: 0000000109926000 CR4: 00000000000006e0 Call Trace: br_handle_vlan+0xbc/0xca [bridge] __br_forward+0x23/0x164 [bridge] deliver_clone+0x41/0x48 [bridge] br_handle_frame_finish+0x36f/0x3aa [bridge] ? skb_dst+0x2e/0x38 [bridge] ? br_handle_ingress_vlan_tunnel+0x3e/0x1c8 [bridge] ? br_handle_frame_finish+0x3aa/0x3aa [bridge] br_handle_frame+0x2c3/0x377 [bridge] ? __skb_pull+0x33/0x51 ? vlan_do_receive+0x4f/0x36a ? br_handle_frame_finish+0x3aa/0x3aa [bridge] __netif_receive_skb_core+0x539/0x7c6 ? __list_del_entry_valid+0x16e/0x1c2 __netif_receive_skb_list_core+0x6d/0xd6 netif_receive_skb_list_internal+0x1d9/0x1fa gro_normal_list+0x22/0x3e dev_gro_receive+0x55b/0x600 ? detach_buf_split+0x58/0x140 napi_gro_receive+0x94/0x12e virtnet_poll+0x15d/0x315 [virtio_net] __napi_poll+0x2c/0x1c9 net_rx_action+0xe6/0x1fb __do_softirq+0x115/0x2d8 run_ksoftirqd+0x18/0x20 smpboot_thread_fn+0x183/0x19c ? smpboot_unregister_percpu_thread+0x66/0x66 kthread+0x10a/0x10f ? kthread_mod_delayed_work+0xb6/0xb6 ret_from_fork+0x22/0x30 ---[ end trace 49f61b07f775fd2b ]--- dst_release: dst:00000000c02d677a refcnt:-1 dst_release underflow Cc: stable@vger.kernel.org Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 21:06:43 UTC
58e2071 net: bridge: fix vlan tunnel dst null pointer dereference This patch fixes a tunnel_dst null pointer dereference due to lockless access in the tunnel egress path. When deleting a vlan tunnel the tunnel_dst pointer is set to NULL without waiting a grace period (i.e. while it's still usable) and packets egressing are dereferencing it without checking. Use READ/WRITE_ONCE to annotate the lockless use of tunnel_id, use RCU for accessing tunnel_dst and make sure it is read only once and checked in the egress path. The dst is already properly RCU protected so we don't need to do anything fancy than to make sure tunnel_id and tunnel_dst are read only once and checked in the egress path. Cc: stable@vger.kernel.org Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 21:06:43 UTC
9d44fa3 ping: Check return value of function 'ping_queue_rcv_skb' Function 'ping_queue_rcv_skb' not always return success, which will also return fail. If not check the wrong return value of it, lead to function `ping_rcv` return success. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 20:44:55 UTC
3bdd5ee skbuff: fix incorrect msg_zerocopy copy notifications msg_zerocopy signals if a send operation required copying with a flag in serr->ee.ee_code. This field can be incorrect as of the below commit, as a result of both structs uarg and serr pointing into the same skb->cb[]. uarg->zerocopy must be read before skb->cb[] is reinitialized to hold serr. Similar to other fields len, hi and lo, use a local variable to temporarily hold the value. This was not a problem before, when the value was passed as a function argument. Fixes: 75518851a2a0 ("skbuff: Push status and refcounts into sock_zerocopy_callback") Reported-by: Talal Ahmad <talalahmad@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 20:39:57 UTC
388fa7f Merge tag 'mlx5-fixes-2021-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-fixes-2021-06-09 ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 10 June 2021, 20:38:46 UTC
54e1217 net/mlx5e: Block offload of outer header csum for GRE tunnel The device is able to offload either the outer header csum or inner header csum. The driver utilizes the inner csum offload. So, prohibit setting of tx-gre-csum-segmentation and let it be: off[fixed]. Fixes: 2729984149e6 ("net/mlx5e: Support TSO and TX checksum offloads for GRE tunnels") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 10 June 2021, 00:20:06 UTC
6d6727d net/mlx5e: Block offload of outer header csum for UDP tunnels The device is able to offload either the outer header csum or inner header csum. The driver utilizes the inner csum offload. Hence, block setting of tx-udp_tnl-csum-segmentation and set it to off[fixed]. Fixes: b49663c8fb49 ("net/mlx5e: Add support for UDP tunnel segmentation with outer checksum offload") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 10 June 2021, 00:20:06 UTC
7a54507 Revert "net/mlx5: Arm only EQs with EQEs" In the scenario described below, an EQ can remain in FIRED state which can result in missing an interrupt generation. The scenario: device mlx5_core driver ------ ---------------- EQ1.eqe generated EQ1.MSI-X sent EQ1.state = FIRED EQ2.eqe generated mlx5_irq() polls - eq1_eqes() arm eq1 polls - eq2_eqes() arm eq2 EQ2.MSI-X sent EQ2.state = FIRED mlx5_irq() polls - eq2_eqes() -- no eqes found driver skips EQ arming; ->EQ2 remains fired, misses generating interrupt. Hence, always arm the EQ by reverting the cited commit in fixes tag. Fixes: d894892dda25 ("net/mlx5: Arm only EQs with EQEs") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 10 June 2021, 00:20:05 UTC
a6ee6f5 net/mlx5e: Fix select queue to consider SKBTX_HW_TSTAMP Steering packets to PTP-SQ should be done only if the SKB has SKBTX_HW_TSTAMP set in the tx_flags. While here, take the function into a header and inline it. Set the whole condition to select the PTP-SQ to unlikely. Fixes: 24c22dd0918b ("net/mlx5e: Add states to PTP channel") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 10 June 2021, 00:20:05 UTC
9ae8c18 net/mlx5e: Don't update netdev RQs with PTP-RQ Since the driver opens the PTP-RQ under channel 0, it appears to the stack as if the SKB was received on rxq0. So from thew stack POV there are still the same number of RX queues. Fixes: 960fbfe222a4 ("net/mlx5e: Allow coexistence of CQE compression and HW TS PTP") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 10 June 2021, 00:20:05 UTC
11f5ac3 net/mlx5e: Verify dev is present in get devlink port ndo When changing eswitch mode, the netdev is detached from the hardware resources. So verify dev is present in get devlink port ndo. Otherwise, we will hit the following panic: [241535.973539] RIP: 0010:__devlink_port_phys_port_name_get+0x13/0x1b0 [241535.976471] RSP: 0018:ffff9eaf0ae1b7c8 EFLAGS: 00010292 [241535.977471] RAX: 000000000002d370 RBX: 000000000002d370 RCX: 0000000000000000 [241535.978479] RDX: 0000000000000010 RSI: ffff9eaf0ae1b858 RDI: 000000000002d370 [241535.979482] RBP: ffff9eaf0ae1b7e0 R08: 000000000000002a R09: ffff8888d54d13da [241535.980486] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8888e6700000 [241535.981491] R13: ffff9eaf0ae1b858 R14: 0000000000000010 R15: 0000000000000000 [241535.982489] FS: 00007fd374ef3740(0000) GS:ffff88909ea00000(0000) knlGS:0000000000000000 [241535.983494] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [241535.984487] CR2: 000000000002d444 CR3: 000000089fd26006 CR4: 00000000003706e0 [241535.985502] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [241535.986499] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [241535.987477] Call Trace: [241535.988426] ? nla_put_64bit+0x71/0xa0 [241535.989368] devlink_compat_phys_port_name_get+0x50/0xa0 [241535.990312] dev_get_phys_port_name+0x4b/0x60 [241535.991252] rtnl_fill_ifinfo+0x57b/0xcb0 [241535.992192] rtnl_dump_ifinfo+0x58f/0x6d0 [241535.993123] ? ksize+0x14/0x20 [241535.994033] ? __alloc_skb+0xe8/0x250 [241535.994935] netlink_dump+0x17c/0x300 [241535.995821] netlink_recvmsg+0x1de/0x2c0 [241535.996677] sock_recvmsg+0x70/0x80 [241535.997518] ____sys_recvmsg+0x9b/0x1b0 [241535.998360] ? iovec_from_user+0x82/0x120 [241535.999202] ? __import_iovec+0x2c/0x130 [241536.000031] ___sys_recvmsg+0x94/0x130 [241536.000850] ? __handle_mm_fault+0x56d/0x6e0 [241536.001668] __sys_recvmsg+0x5f/0xb0 [241536.002464] ? syscall_enter_from_user_mode+0x2b/0x80 [241536.003242] __x64_sys_recvmsg+0x1f/0x30 [241536.004008] do_syscall_64+0x38/0x50 [241536.004767] entry_SYSCALL_64_after_hwframe+0x44/0xae [241536.005532] RIP: 0033:0x7fd375014f47 Fixes: 2ff349c5edfe ("net/mlx5e: Verify dev is present in some ndos") Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Chris Mi <cmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 10 June 2021, 00:20:04 UTC
4aaf96a net/mlx5: DR, Don't use SW steering when RoCE is not supported SW steering uses RC QP to write/read to/from ICM, hence it's not supported when RoCE is not supported as well. Fixes: 70605ea545e8 ("net/mlx5: DR, Expose APIs for direct rule managing") Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 10 June 2021, 00:20:04 UTC
c189716 net/mlx5: Consider RoCE cap before init RDMA resources Check if RoCE is supported by the device before enable it in the vport context and create all the RDMA steering objects. Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic") Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 10 June 2021, 00:20:04 UTC
a3e5fd9 net/mlx5e: Fix page reclaim for dead peer hairpin When adding a hairpin flow, a firmware-side send queue is created for the peer net device, which claims some host memory pages for its internal ring buffer. If the peer net device is removed/unbound before the hairpin flow is deleted, then the send queue is not destroyed which leads to a stack trace on pci device remove: [ 748.005230] mlx5_core 0000:08:00.2: wait_func:1094:(pid 12985): MANAGE_PAGES(0x108) timeout. Will cause a leak of a command resource [ 748.005231] mlx5_core 0000:08:00.2: reclaim_pages:514:(pid 12985): failed reclaiming pages: err -110 [ 748.001835] mlx5_core 0000:08:00.2: mlx5_reclaim_root_pages:653:(pid 12985): failed reclaiming pages (-110) for func id 0x0 [ 748.002171] ------------[ cut here ]------------ [ 748.001177] FW pages counter is 4 after reclaiming all pages [ 748.001186] WARNING: CPU: 1 PID: 12985 at drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:685 mlx5_reclaim_startup_pages+0x34b/0x460 [mlx5_core] [ +0.002771] Modules linked in: cls_flower mlx5_ib mlx5_core ptp pps_core act_mirred sch_ingress openvswitch nsh xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm ib_umad ib_ipoib iw_cm ib_cm ib_uverbs ib_core overlay fuse [last unloaded: pps_core] [ 748.007225] CPU: 1 PID: 12985 Comm: tee Not tainted 5.12.0+ #1 [ 748.001376] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 748.002315] RIP: 0010:mlx5_reclaim_startup_pages+0x34b/0x460 [mlx5_core] [ 748.001679] Code: 28 00 00 00 0f 85 22 01 00 00 48 81 c4 b0 00 00 00 31 c0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 c7 c7 40 cc 19 a1 e8 9f 71 0e e2 <0f> 0b e9 30 ff ff ff 48 c7 c7 a0 cc 19 a1 e8 8c 71 0e e2 0f 0b e9 [ 748.003781] RSP: 0018:ffff88815220faf8 EFLAGS: 00010286 [ 748.001149] RAX: 0000000000000000 RBX: ffff8881b4900280 RCX: 0000000000000000 [ 748.001445] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed102a441f51 [ 748.001614] RBP: 00000000000032b9 R08: 0000000000000001 R09: ffffed1054a15ee8 [ 748.001446] R10: ffff8882a50af73b R11: ffffed1054a15ee7 R12: fffffbfff07c1e30 [ 748.001447] R13: dffffc0000000000 R14: ffff8881b492cba8 R15: 0000000000000000 [ 748.001429] FS: 00007f58bd08b580(0000) GS:ffff8882a5080000(0000) knlGS:0000000000000000 [ 748.001695] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 748.001309] CR2: 000055a026351740 CR3: 00000001d3b48006 CR4: 0000000000370ea0 [ 748.001506] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 748.001483] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 748.001654] Call Trace: [ 748.000576] ? mlx5_satisfy_startup_pages+0x290/0x290 [mlx5_core] [ 748.001416] ? mlx5_cmd_teardown_hca+0xa2/0xd0 [mlx5_core] [ 748.001354] ? mlx5_cmd_init_hca+0x280/0x280 [mlx5_core] [ 748.001203] mlx5_function_teardown+0x30/0x60 [mlx5_core] [ 748.001275] mlx5_uninit_one+0xa7/0xc0 [mlx5_core] [ 748.001200] remove_one+0x5f/0xc0 [mlx5_core] [ 748.001075] pci_device_remove+0x9f/0x1d0 [ 748.000833] device_release_driver_internal+0x1e0/0x490 [ 748.001207] unbind_store+0x19f/0x200 [ 748.000942] ? sysfs_file_ops+0x170/0x170 [ 748.001000] kernfs_fop_write_iter+0x2bc/0x450 [ 748.000970] new_sync_write+0x373/0x610 [ 748.001124] ? new_sync_read+0x600/0x600 [ 748.001057] ? lock_acquire+0x4d6/0x700 [ 748.000908] ? lockdep_hardirqs_on_prepare+0x400/0x400 [ 748.001126] ? fd_install+0x1c9/0x4d0 [ 748.000951] vfs_write+0x4d0/0x800 [ 748.000804] ksys_write+0xf9/0x1d0 [ 748.000868] ? __x64_sys_read+0xb0/0xb0 [ 748.000811] ? filp_open+0x50/0x50 [ 748.000919] ? syscall_enter_from_user_mode+0x1d/0x50 [ 748.001223] do_syscall_64+0x3f/0x80 [ 748.000892] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 748.001026] RIP: 0033:0x7f58bcfb22f7 [ 748.000944] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 [ 748.003925] RSP: 002b:00007fffd7f2aaa8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 748.001732] RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007f58bcfb22f7 [ 748.001426] RDX: 000000000000000d RSI: 00007fffd7f2abc0 RDI: 0000000000000003 [ 748.001746] RBP: 00007fffd7f2abc0 R08: 0000000000000000 R09: 0000000000000001 [ 748.001631] R10: 00000000000001b6 R11: 0000000000000246 R12: 000000000000000d [ 748.001537] R13: 00005597ac2c24a0 R14: 000000000000000d R15: 00007f58bd084700 [ 748.001564] irq event stamp: 0 [ 748.000787] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [ 748.001399] hardirqs last disabled at (0): [<ffffffff813132cf>] copy_process+0x146f/0x5eb0 [ 748.001854] softirqs last enabled at (0): [<ffffffff8131330e>] copy_process+0x14ae/0x5eb0 [ 748.013431] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 748.001492] ---[ end trace a6fabd773d1c51ae ]--- Fix by destroying the send queue of a hairpin peer net device that is being removed/unbound, which returns the allocated ring buffer pages to the host. Fixes: 4d8fcf216c90 ("net/mlx5e: Avoid unbounded peer devices when unpairing TC hairpin rules") Signed-off-by: Dima Chumak <dchumak@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 10 June 2021, 00:20:03 UTC
8ad893e net/mlx5e: Remove dependency in IPsec initialization flows Currently, IPsec feature is disabled because mlx5e_build_nic_netdev is required to be called after mlx5e_ipsec_init. This requirement is invalid as mlx5e_build_nic_netdev and mlx5e_ipsec_init initialize independent resources. Remove ipsec pointer check in mlx5e_build_nic_netdev so that the two functions can be called at any order. Fixes: 547eede070eb ("net/mlx5e: IPSec, Innova IPSec offload infrastructure") Signed-off-by: Huy Nguyen <huyn@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 10 June 2021, 00:20:03 UTC
fb1a313 net/mlx5e: Fix use-after-free of encap entry in neigh update handler Function mlx5e_rep_neigh_update() wasn't updated to accommodate rtnl lock removal from TC filter update path and properly handle concurrent encap entry insertion/deletion which can lead to following use-after-free: [23827.464923] ================================================================== [23827.469446] BUG: KASAN: use-after-free in mlx5e_encap_take+0x72/0x140 [mlx5_core] [23827.470971] Read of size 4 at addr ffff8881d132228c by task kworker/u20:6/21635 [23827.472251] [23827.472615] CPU: 9 PID: 21635 Comm: kworker/u20:6 Not tainted 5.13.0-rc3+ #5 [23827.473788] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [23827.475639] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core] [23827.476731] Call Trace: [23827.477260] dump_stack+0xbb/0x107 [23827.477906] print_address_description.constprop.0+0x18/0x140 [23827.478896] ? mlx5e_encap_take+0x72/0x140 [mlx5_core] [23827.479879] ? mlx5e_encap_take+0x72/0x140 [mlx5_core] [23827.480905] kasan_report.cold+0x7c/0xd8 [23827.481701] ? mlx5e_encap_take+0x72/0x140 [mlx5_core] [23827.482744] kasan_check_range+0x145/0x1a0 [23827.493112] mlx5e_encap_take+0x72/0x140 [mlx5_core] [23827.494054] ? mlx5e_tc_tun_encap_info_equal_generic+0x140/0x140 [mlx5_core] [23827.495296] mlx5e_rep_neigh_update+0x41e/0x5e0 [mlx5_core] [23827.496338] ? mlx5e_rep_neigh_entry_release+0xb80/0xb80 [mlx5_core] [23827.497486] ? read_word_at_a_time+0xe/0x20 [23827.498250] ? strscpy+0xa0/0x2a0 [23827.498889] process_one_work+0x8ac/0x14e0 [23827.499638] ? lockdep_hardirqs_on_prepare+0x400/0x400 [23827.500537] ? pwq_dec_nr_in_flight+0x2c0/0x2c0 [23827.501359] ? rwlock_bug.part.0+0x90/0x90 [23827.502116] worker_thread+0x53b/0x1220 [23827.502831] ? process_one_work+0x14e0/0x14e0 [23827.503627] kthread+0x328/0x3f0 [23827.504254] ? _raw_spin_unlock_irq+0x24/0x40 [23827.505065] ? __kthread_bind_mask+0x90/0x90 [23827.505912] ret_from_fork+0x1f/0x30 [23827.506621] [23827.506987] Allocated by task 28248: [23827.507694] kasan_save_stack+0x1b/0x40 [23827.508476] __kasan_kmalloc+0x7c/0x90 [23827.509197] mlx5e_attach_encap+0xde1/0x1d40 [mlx5_core] [23827.510194] mlx5e_tc_add_fdb_flow+0x397/0xc40 [mlx5_core] [23827.511218] __mlx5e_add_fdb_flow+0x519/0xb30 [mlx5_core] [23827.512234] mlx5e_configure_flower+0x191c/0x4870 [mlx5_core] [23827.513298] tc_setup_cb_add+0x1d5/0x420 [23827.514023] fl_hw_replace_filter+0x382/0x6a0 [cls_flower] [23827.514975] fl_change+0x2ceb/0x4a51 [cls_flower] [23827.515821] tc_new_tfilter+0x89a/0x2070 [23827.516548] rtnetlink_rcv_msg+0x644/0x8c0 [23827.517300] netlink_rcv_skb+0x11d/0x340 [23827.518021] netlink_unicast+0x42b/0x700 [23827.518742] netlink_sendmsg+0x743/0xc20 [23827.519467] sock_sendmsg+0xb2/0xe0 [23827.520131] ____sys_sendmsg+0x590/0x770 [23827.520851] ___sys_sendmsg+0xd8/0x160 [23827.521552] __sys_sendmsg+0xb7/0x140 [23827.522238] do_syscall_64+0x3a/0x70 [23827.522907] entry_SYSCALL_64_after_hwframe+0x44/0xae [23827.523797] [23827.524163] Freed by task 25948: [23827.524780] kasan_save_stack+0x1b/0x40 [23827.525488] kasan_set_track+0x1c/0x30 [23827.526187] kasan_set_free_info+0x20/0x30 [23827.526968] __kasan_slab_free+0xed/0x130 [23827.527709] slab_free_freelist_hook+0xcf/0x1d0 [23827.528528] kmem_cache_free_bulk+0x33a/0x6e0 [23827.529317] kfree_rcu_work+0x55f/0xb70 [23827.530024] process_one_work+0x8ac/0x14e0 [23827.530770] worker_thread+0x53b/0x1220 [23827.531480] kthread+0x328/0x3f0 [23827.532114] ret_from_fork+0x1f/0x30 [23827.532785] [23827.533147] Last potentially related work creation: [23827.534007] kasan_save_stack+0x1b/0x40 [23827.534710] kasan_record_aux_stack+0xab/0xc0 [23827.535492] kvfree_call_rcu+0x31/0x7b0 [23827.536206] mlx5e_tc_del_fdb_flow+0x577/0xef0 [mlx5_core] [23827.537305] mlx5e_flow_put+0x49/0x80 [mlx5_core] [23827.538290] mlx5e_delete_flower+0x6d1/0xe60 [mlx5_core] [23827.539300] tc_setup_cb_destroy+0x18e/0x2f0 [23827.540144] fl_hw_destroy_filter+0x1d2/0x310 [cls_flower] [23827.541148] __fl_delete+0x4dc/0x660 [cls_flower] [23827.541985] fl_delete+0x97/0x160 [cls_flower] [23827.542782] tc_del_tfilter+0x7ab/0x13d0 [23827.543503] rtnetlink_rcv_msg+0x644/0x8c0 [23827.544257] netlink_rcv_skb+0x11d/0x340 [23827.544981] netlink_unicast+0x42b/0x700 [23827.545700] netlink_sendmsg+0x743/0xc20 [23827.546424] sock_sendmsg+0xb2/0xe0 [23827.547084] ____sys_sendmsg+0x590/0x770 [23827.547850] ___sys_sendmsg+0xd8/0x160 [23827.548606] __sys_sendmsg+0xb7/0x140 [23827.549303] do_syscall_64+0x3a/0x70 [23827.549969] entry_SYSCALL_64_after_hwframe+0x44/0xae [23827.550853] [23827.551217] The buggy address belongs to the object at ffff8881d1322200 [23827.551217] which belongs to the cache kmalloc-256 of size 256 [23827.553341] The buggy address is located 140 bytes inside of [23827.553341] 256-byte region [ffff8881d1322200, ffff8881d1322300) [23827.555747] The buggy address belongs to the page: [23827.556847] page:00000000898762aa refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1d1320 [23827.558651] head:00000000898762aa order:2 compound_mapcount:0 compound_pincount:0 [23827.559961] flags: 0x2ffff800010200(slab|head|node=0|zone=2|lastcpupid=0x1ffff) [23827.561243] raw: 002ffff800010200 dead000000000100 dead000000000122 ffff888100042b40 [23827.562653] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000 [23827.564112] page dumped because: kasan: bad access detected [23827.565439] [23827.565932] Memory state around the buggy address: [23827.566917] ffff8881d1322180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [23827.568485] ffff8881d1322200: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [23827.569818] >ffff8881d1322280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [23827.571143] ^ [23827.571879] ffff8881d1322300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [23827.573283] ffff8881d1322380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [23827.574654] ================================================================== Most of the necessary logic is already correctly implemented by mlx5e_get_next_valid_encap() helper that is used in neigh stats update handler. Make the handler generic by renaming it to mlx5e_get_next_matching_encap() and use callback to test whether flow is matching instead of hardcoded check for 'valid' flag value. Implement mlx5e_get_next_valid_encap() by calling mlx5e_get_next_matching_encap() with callback that tests encap MLX5_ENCAP_ENTRY_VALID flag. Implement new mlx5e_get_next_init_encap() helper by calling mlx5e_get_next_matching_encap() with callback that tests encap completion result to be non-error and use it in mlx5e_rep_neigh_update() to safely iterate over nhe->encap_list. Remove encap completion logic from mlx5e_rep_update_flows() since the encap entries passed to this function are already guaranteed to be properly initialized by similar code in mlx5e_get_next_init_encap(). Fixes: 2a1f1768fa17 ("net/mlx5e: Refactor neigh update for concurrent execution") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 10 June 2021, 00:20:03 UTC
2bf8d2a net/mlx5e: Fix an error code in mlx5e_arfs_create_tables() When the code execute 'if (!priv->fs.arfs->wq)', the value of err is 0. So, we use -ENOMEM to indicate that the function create_singlethread_workqueue() return NULL. Clean up smatch warning: drivers/net/ethernet/mellanox/mlx5/core/en_arfs.c:373 mlx5e_arfs_create_tables() warn: missing error code 'err'. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Fixes: f6755b80d693 ("net/mlx5e: Dynamic alloc arfs table for netdev when needed") Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> 10 June 2021, 00:20:02 UTC
6cde05a Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-06-09 This series contains updates to ice driver only. Maciej informs the user when XDP is not supported due to the driver being in the 'safe mode' state. He also adds a parameter to Tx queue configuration to resolve an issue in configuring XDP queues as it cannot rely on using the number Tx or Rx queues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 09 June 2021, 22:45:16 UTC
13c62f5 net/sched: act_ct: handle DNAT tuple collision This this the counterpart of 8aa7b526dc0b ("openvswitch: handle DNAT tuple collision") for act_ct. From that commit changelog: """ With multiple DNAT rules it's possible that after destination translation the resulting tuples collide. ... Netfilter handles this case by allocating a null binding for SNAT at egress by default. Perform the same operation in openvswitch for DNAT if no explicit SNAT is requested by the user and allocate a null binding for SNAT for packets in the "original" direction. """ Fixes: 95219afbb980 ("act_ct: support asymmetric conntrack") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> 09 June 2021, 22:34:51 UTC
d2e381c rtnetlink: Fix regression in bridge VLAN configuration Cited commit started returning errors when notification info is not filled by the bridge driver, resulting in the following regression: # ip link add name br1 type bridge vlan_filtering 1 # bridge vlan add dev br1 vid 555 self pvid untagged RTNETLINK answers: Invalid argument As long as the bridge driver does not fill notification info for the bridge device itself, an empty notification should not be considered as an error. This is explained in commit 59ccaaaa49b5 ("bridge: dont send notification when skb->len == 0 in rtnl_bridge_notify"). Fix by removing the error and add a comment to avoid future bugs. Fixes: a8db57c1d285 ("rtnetlink: Fix missing error code in rtnl_bridge_notify()") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> 09 June 2021, 21:58:26 UTC
93124d4 Merge tag 'mac80211-for-net-2021-06-09' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes berg says: ==================== A fair number of fixes: * fix more fallout from RTNL locking changes * fixes for some of the bugs found by syzbot * drop multicast fragments in mac80211 to align with the spec and what drivers are doing now * fix NULL-ptr deref in radiotap injection ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 09 June 2021, 21:46:21 UTC
back to top