https://github.com/torvalds/linux

sort by:
Revision Author Date Message Commit Date
0c10455 ptp: ocp: Select CRC16 in the Kconfig. The crc16() function is used to check the firmware validity, but the library was not explicitly selected. Fixes: 3c3673bde50c ("ptp: ocp: Add firmware header checks") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Vadim Fedorenko <vadfed@fb.com> Link: https://lore.kernel.org/r/20220726220604.1339972-1-jonathan.lemon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 28 July 2022, 01:11:34 UTC
e62d2e1 tcp: md5: fix IPv4-mapped support After the blamed commit, IPv4 SYN packets handled by a dual stack IPv6 socket are dropped, even if perfectly valid. $ nstat | grep MD5 TcpExtTCPMD5Failure 5 0.0 For a dual stack listener, an incoming IPv4 SYN packet would call tcp_inbound_md5_hash() with @family == AF_INET, while tp->af_specific is pointing to tcp_sock_ipv6_specific. Only later when an IPv4-mapped child is created, tp->af_specific is changed to tcp_sock_ipv6_mapped_specific. Fixes: 7bbb765b7349 ("net/tcp: Merge TCP-MD5 inbound callbacks") Reported-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Dmitry Safonov <dima@arista.com> Tested-by: Leonard Crestez <cdleonard@gmail.com> Link: https://lore.kernel.org/r/20220726115743.2759832-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 27 July 2022, 17:18:21 UTC
5a15912 virtio-net: fix the race between refill work and close We try using cancel_delayed_work_sync() to prevent the work from enabling NAPI. This is insufficient since we don't disable the source of the refill work scheduling. This means an NAPI poll callback after cancel_delayed_work_sync() can schedule the refill work then can re-enable the NAPI that leads to use-after-free [1]. Since the work can enable NAPI, we can't simply disable NAPI before calling cancel_delayed_work_sync(). So fix this by introducing a dedicated boolean to control whether or not the work could be scheduled from NAPI. [1] ================================================================== BUG: KASAN: use-after-free in refill_work+0x43/0xd4 Read of size 2 at addr ffff88810562c92e by task kworker/2:1/42 CPU: 2 PID: 42 Comm: kworker/2:1 Not tainted 5.19.0-rc1+ #480 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Workqueue: events refill_work Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0xbb/0x6ac ? _printk+0xad/0xde ? refill_work+0x43/0xd4 kasan_report+0xa8/0x130 ? refill_work+0x43/0xd4 refill_work+0x43/0xd4 process_one_work+0x43d/0x780 worker_thread+0x2a0/0x6f0 ? process_one_work+0x780/0x780 kthread+0x167/0x1a0 ? kthread_exit+0x50/0x50 ret_from_fork+0x22/0x30 </TASK> ... Fixes: b2baed69e605c ("virtio_net: set/cancel work on ndo_open/ndo_stop") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> 27 July 2022, 12:20:44 UTC
b5177ed mptcp: Do not return EINPROGRESS when subflow creation succeeds New subflows are created within the kernel using O_NONBLOCK, so EINPROGRESS is the expected return value from kernel_connect(). __mptcp_subflow_connect() has the correct logic to consider EINPROGRESS to be a successful case, but it has also used that error code as its return value. Before v5.19 this was benign: all the callers ignored the return value. Starting in v5.19 there is a MPTCP_PM_CMD_SUBFLOW_CREATE generic netlink command that does use the return value, so the EINPROGRESS gets propagated to userspace. Make __mptcp_subflow_connect() always return 0 on success instead. Fixes: ec3edaa7ca6c ("mptcp: Add handling of outgoing MP_JOIN requests") Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment") Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Link: https://lore.kernel.org/r/20220725205231.87529-1-mathew.j.martineau@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 27 July 2022, 02:57:55 UTC
e77ea97 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Florian Westphal says: ==================== netfilter updates for net Three late fixes for netfilter: 1) If nf_queue user requests packet truncation below size of l3 header, we corrupt the skb, then crash. Reject such requests. 2) add cond_resched() calls when doing cycle detection in the nf_tables graph. This avoids softlockup warning with certain rulesets. 3) Reject rulesets that use nftables 'queue' expression in family/chain combinations other than those that are supported. Currently the ruleset will load, but when userspace attempts to reinject you get WARN splat + packet drops. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nft_queue: only allow supported familes and hooks netfilter: nf_tables: add rescheduling points during loop detection walks netfilter: nf_queue: do not allow packet truncation below transport header offset ==================== Link: https://lore.kernel.org/r/20220726192056.13497-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> 27 July 2022, 02:53:09 UTC
e53f529 Merge tag 'for-net-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - Fix early wakeup after suspend - Fix double free on error - Fix use-after-free on l2cap_chan_put * tag 'for-net-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put Bluetooth: Always set event mask on suspend Bluetooth: mgmt: Fix double free on error path ==================== Link: https://lore.kernel.org/r/20220726221328.423714-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 27 July 2022, 02:48:24 UTC
d0be834 Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put This fixes the following trace which is caused by hci_rx_work starting up *after* the final channel reference has been put() during sock_close() but *before* the references to the channel have been destroyed, so instead the code now rely on kref_get_unless_zero/l2cap_chan_hold_unless_zero to prevent referencing a channel that is about to be destroyed. refcount_t: increment on 0; use-after-free. BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0 Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705 CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S W 4.14.234-00003-g1fb6d0bd49a4-dirty #28 Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150 Google Inc. MSM sm8150 Flame DVT (DT) Workqueue: hci0 hci_rx_work Call trace: dump_backtrace+0x0/0x378 show_stack+0x20/0x2c dump_stack+0x124/0x148 print_address_description+0x80/0x2e8 __kasan_report+0x168/0x188 kasan_report+0x10/0x18 __asan_load4+0x84/0x8c refcount_dec_and_test+0x20/0xd0 l2cap_chan_put+0x48/0x12c l2cap_recv_frame+0x4770/0x6550 l2cap_recv_acldata+0x44c/0x7a4 hci_acldata_packet+0x100/0x188 hci_rx_work+0x178/0x23c process_one_work+0x35c/0x95c worker_thread+0x4cc/0x960 kthread+0x1a8/0x1c4 ret_from_fork+0x10/0x18 Cc: stable@kernel.org Reported-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Tested-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> 26 July 2022, 20:35:24 UTC
ef61b6e Bluetooth: Always set event mask on suspend When suspending, always set the event mask once disconnects are successful. Otherwise, if wakeup is disallowed, the event mask is not set before suspend continues and can result in an early wakeup. Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier") Cc: stable@vger.kernel.org Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> 26 July 2022, 20:35:13 UTC
4b2f4e0 Bluetooth: mgmt: Fix double free on error path Don't call mgmt_pending_remove() twice (double free). Fixes: 6b88eff43704 ("Bluetooth: hci_sync: Refactor remove Adv Monitor") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> 26 July 2022, 20:32:40 UTC
aa40d5a wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop() lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1], for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped") guards clear_bit() using fq.lock even before fq_init() from ieee80211_txq_setup_flows() initializes this spinlock. According to discussion [2], Toke was not happy with expanding usage of fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we can instead use synchronize_rcu() for flushing ieee80211_wake_txqs(). Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1] Link: https://lkml.kernel.org/r/874k0zowh2.fsf@toke.dk [2] Reported-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped") Tested-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com> Acked-by: Toke Høiland-Jørgensen <toke@kernel.org> Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://lore.kernel.org/r/9cc9b81d-75a3-3925-b612-9d0ad3cab82b@I-love.SAKURA.ne.jp [ pick up commit 3598cb6e1862 ("wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()") from -next] Link: https://lore.kernel.org/all/87o7xcq6qt.fsf@kernel.org/ Signed-off-by: Jakub Kicinski <kuba@kernel.org> 26 July 2022, 20:23:05 UTC
47f4f51 netfilter: nft_queue: only allow supported familes and hooks Trying to use 'queue' statement in ingress (for example) triggers a splat on reinject: WARNING: CPU: 3 PID: 1345 at net/netfilter/nf_queue.c:291 ... because nf_reinject cannot find the ruleset head. The netdev family doesn't support async resume at the moment anyway, so disallow loading such rulesets with a more appropriate error message. v2: add 'validate' callback and also check hook points, v1 did allow ingress use in 'table inet', but that doesn't work either. (Pablo) Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org> 26 July 2022, 19:12:42 UTC
81ea010 netfilter: nf_tables: add rescheduling points during loop detection walks Add explicit rescheduling points during ruleset walk. Switching to a faster algorithm is possible but this is a much smaller change, suitable for nf tree. Link: https://bugzilla.netfilter.org/show_bug.cgi?id=1460 Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> 26 July 2022, 19:12:42 UTC
99a63d3 netfilter: nf_queue: do not allow packet truncation below transport header offset Domingo Dirutigliano and Nicola Guerrera report kernel panic when sending nf_queue verdict with 1-byte nfta_payload attribute. The IP/IPv6 stack pulls the IP(v6) header from the packet after the input hook. If user truncates the packet below the header size, this skb_pull() will result in a malformed skb (skb->len < 0). Fixes: 7af4cc3fa158 ("[NETFILTER]: Add "nfnetlink_queue" netfilter queue handler over nfnetlink") Reported-by: Domingo Dirutigliano <pwnzer0tt1@proton.me> Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org> 26 July 2022, 19:12:42 UTC
9b134b1 bridge: Do not send empty IFLA_AF_SPEC attribute After commit b6c02ef54913 ("bridge: Netlink interface fix."), br_fill_ifinfo() started to send an empty IFLA_AF_SPEC attribute when a bridge vlan dump is requested but an interface does not have any vlans configured. iproute2 ignores such an empty attribute since commit b262a9becbcb ("bridge: Fix output with empty vlan lists") but older iproute2 versions as well as other utilities have their output changed by the cited kernel commit, resulting in failed test cases. Regardless, emitting an empty attribute is pointless and inefficient. Avoid this change by canceling the attribute if no AF_SPEC data was added. Fixes: b6c02ef54913 ("bridge: Netlink interface fix.") Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20220725001236.95062-1-bpoirier@nvidia.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> 26 July 2022, 13:35:53 UTC
33881ab Merge branch 'octeontx2-minor-tc-fixes' Subbaraya Sundeep says: ==================== Octeontx2 minor tc fixes This patch set fixes two problems found in tc code wrt to ratelimiting and when installing UDP/TCP filters. Patch 1: CN10K has different register format compared to CN9xx hence fixes that. Patch 2: Check flow mask also before installing a src/dst port filter, otherwise installing for one port installs for other one too. ==================== Link: https://lore.kernel.org/r/1658650874-16459-1-git-send-email-sbhatta@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> 26 July 2022, 11:05:46 UTC
59e1be6 octeontx2-pf: Fix UDP/TCP src and dst port tc filters Check the mask for non-zero value before installing tc filters for L4 source and destination ports. Otherwise installing a filter for source port installs destination port too and vice-versa. Fixes: 1d4d9e42c240 ("octeontx2-pf: Add tc flower hardware offload on ingress traffic") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> 26 July 2022, 11:05:44 UTC
b354eae octeontx2-pf: cn10k: Fix egress ratelimit configuration NIX_AF_TLXX_PIR/CIR register format has changed from OcteonTx2 to CN10K. CN10K supports larger burst size. Fix burst exponent and burst mantissa configuration for CN10K. Also fixed 'maxrate' from u32 to u64 since 'police.rate_bytes_ps' passed by stack is also u64. Fixes: e638a83f167e ("octeontx2-pf: TC_MATCHALL egress ratelimiting offload") Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> 26 July 2022, 11:05:44 UTC
b89fc26 sctp: fix sleep in atomic context bug in timer handlers There are sleep in atomic context bugs in timer handlers of sctp such as sctp_generate_t3_rtx_event(), sctp_generate_probe_event(), sctp_generate_t1_init_event(), sctp_generate_timeout_event(), sctp_generate_t3_rtx_event() and so on. The root cause is sctp_sched_prio_init_sid() with GFP_KERNEL parameter that may sleep could be called by different timer handlers which is in interrupt context. One of the call paths that could trigger bug is shown below: (interrupt context) sctp_generate_probe_event sctp_do_sm sctp_side_effects sctp_cmd_interpreter sctp_outq_teardown sctp_outq_init sctp_sched_set_sched n->init_sid(..,GFP_KERNEL) sctp_sched_prio_init_sid //may sleep This patch changes gfp_t parameter of init_sid in sctp_sched_set_sched() from GFP_KERNEL to GFP_ATOMIC in order to prevent sleep in atomic context bugs. Fixes: 5bbbbe32a431 ("sctp: introduce stream scheduler foundations") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Link: https://lore.kernel.org/r/20220723015809.11553-1-duoming@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> 26 July 2022, 02:39:05 UTC
c7560d1 net: dsa: fix reference counting for LAG FDBs Due to an invalid conflict resolution on my side while working on 2 different series (LAG FDBs and FDB isolation), dsa_switch_do_lag_fdb_add() does not store the database associated with a dsa_mac_addr structure. So after adding an FDB entry associated with a LAG, dsa_mac_addr_find() fails to find it while deleting it, because &a->db is zeroized memory for all stored FDB entries of lag->fdbs, and dsa_switch_do_lag_fdb_del() returns -ENOENT rather than deleting the entry. Fixes: c26933639b54 ("net: dsa: request drivers to perform FDB isolation") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220723012411.1125066-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 26 July 2022, 02:37:06 UTC
5fcbb71 i40e: Fix interface init with MSI interrupts (no MSI-X) Fix the inability to bring an interface up on a setup with only MSI interrupts enabled (no MSI-X). Solution is to add a default number of QPs = 1. This is enough, since without MSI-X support driver enables only a basic feature set. Fixes: bc6d33c8d93f ("i40e: Fix the number of queues available to be mapped for use") Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com> Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com> Tested-by: Dave Switzer <david.switzer@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20220722175401.112572-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 26 July 2022, 01:48:41 UTC
9af0620 Merge branch 'net-sysctl-races-part-6' Kuniyuki Iwashima says: ==================== sysctl: Fix data-races around ipv4_net_table (Round 6, Final). This series fixes data-races around 11 knobs after tcp_pacing_ss_ratio ipv4_net_table, and this is the final round for ipv4_net_table. While at it, other data-races around these related knobs are fixed. - decnet_mem - decnet_rmem - tipc_rmem There are still 58 tables possibly missing some fixes under net/. $ grep -rnE "struct ctl_table.*?\[\] =" net/ | wc -l 60 ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 11:42:10 UTC
96b9bd8 ipv4: Fix data-races around sysctl_fib_notify_on_flag_change. While reading sysctl_fib_notify_on_flag_change, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 680aea08e78c ("net: ipv4: Emit notification when fib hardware flags are changed") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 11:42:10 UTC
870e3a6 tcp: Fix data-races around sysctl_tcp_reflect_tos. While reading sysctl_tcp_reflect_tos, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: ac8f1710c12b ("tcp: reflect tos value received in SYN to the socket") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 11:42:10 UTC
79f5547 tcp: Fix a data-race around sysctl_tcp_comp_sack_nr. While reading sysctl_tcp_comp_sack_nr, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 9c21d2fc41c0 ("tcp: add tcp_comp_sack_nr sysctl") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 11:42:10 UTC
2239694 tcp: Fix a data-race around sysctl_tcp_comp_sack_slack_ns. While reading sysctl_tcp_comp_sack_slack_ns, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: a70437cc09a1 ("tcp: add hrtimer slack to sack compression") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 11:42:10 UTC
4866b2b tcp: Fix a data-race around sysctl_tcp_comp_sack_delay_ns. While reading sysctl_tcp_comp_sack_delay_ns, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 6d82aa242092 ("tcp: add tcp_comp_sack_delay_ns sysctl") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 11:42:09 UTC
0273954 net: Fix data-races around sysctl_[rw]mem(_offset)?. While reading these sysctl variables, they can be changed concurrently. Thus, we need to add READ_ONCE() to their readers. - .sysctl_rmem - .sysctl_rwmem - .sysctl_rmem_offset - .sysctl_wmem_offset - sysctl_tcp_rmem[1, 2] - sysctl_tcp_wmem[1, 2] - sysctl_decnet_rmem[1] - sysctl_decnet_wmem[1] - sysctl_tipc_rmem[1] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 11:42:09 UTC
59bf6c6 tcp: Fix data-races around sk_pacing_rate. While reading sysctl_tcp_pacing_(ss|ca)_ratio, they can be changed concurrently. Thus, we need to add READ_ONCE() to their readers. Fixes: 43e122b014c9 ("tcp: refine pacing rate determination") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 11:42:09 UTC
3e7d18b net: mld: fix reference count leak in mld_{query | report}_work() mld_{query | report}_work() processes queued events. If there are too many events in the queue, it re-queue a work. And then, it returns without in6_dev_put(). But if queuing is failed, it should call in6_dev_put(), but it doesn't. So, a reference count leak would occur. THREAD0 THREAD1 mld_report_work() spin_lock_bh() if (!mod_delayed_work()) in6_dev_hold(); spin_unlock_bh() spin_lock_bh() schedule_delayed_work() spin_unlock_bh() Script to reproduce(by Hangbin Liu): ip netns add ns1 ip netns add ns2 ip netns exec ns1 sysctl -w net.ipv6.conf.all.force_mld_version=1 ip netns exec ns2 sysctl -w net.ipv6.conf.all.force_mld_version=1 ip -n ns1 link add veth0 type veth peer name veth0 netns ns2 ip -n ns1 link set veth0 up ip -n ns2 link set veth0 up for i in `seq 50`; do for j in `seq 100`; do ip -n ns1 addr add 2021:${i}::${j}/64 dev veth0 ip -n ns2 addr add 2022:${i}::${j}/64 dev veth0 done done modprobe -r veth ip -a netns del splat looks like: unregister_netdevice: waiting for veth0 to become free. Usage count = 2 leaked reference. ipv6_add_dev+0x324/0xec0 addrconf_notify+0x481/0xd10 raw_notifier_call_chain+0xe3/0x120 call_netdevice_notifiers+0x106/0x160 register_netdevice+0x114c/0x16b0 veth_newlink+0x48b/0xa50 [veth] rtnl_newlink+0x11a2/0x1a40 rtnetlink_rcv_msg+0x63f/0xc00 netlink_rcv_skb+0x1df/0x3e0 netlink_unicast+0x5de/0x850 netlink_sendmsg+0x6c9/0xa90 ____sys_sendmsg+0x76a/0x780 __sys_sendmsg+0x27c/0x340 do_syscall_64+0x43/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Tested-by: Hangbin Liu <liuhangbin@gmail.com> Fixes: f185de28d9ae ("mld: add new workqueues for process mld events") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 11:33:59 UTC
c7b205f net: macsec: fix potential resource leak in macsec_add_rxsa() and macsec_add_txsa() init_rx_sa() allocates relevant resource for rx_sa->stats and rx_sa-> key.tfm with alloc_percpu() and macsec_alloc_tfm(). When some error occurs after init_rx_sa() is called in macsec_add_rxsa(), the function released rx_sa with kfree() without releasing rx_sa->stats and rx_sa-> key.tfm, which will lead to a resource leak. We should call macsec_rxsa_put() instead of kfree() to decrease the ref count of rx_sa and release the relevant resource if the refcount is 0. The same bug exists in macsec_add_txsa() for tx_sa as well. This patch fixes the above two bugs. Fixes: 3cf3227a21d1 ("net: macsec: hardware offloading infrastructure") Signed-off-by: Jianglei Nie <niejianglei2021@163.com> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 10:51:05 UTC
20a8546 Merge branch 'macsec-config-issues' Sabrina Dubroca says: ==================== macsec: fix config issues The patch adding netlink support for XPN (commit 48ef50fa866a ("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)")) introduced several issues, including a kernel panic reported at [1]. Reproducing those bugs with upstream iproute is limited, since iproute doesn't currently support XPN. I'm also working on this. [1] https://bugzilla.kernel.org/show_bug.cgi?id=208315 ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 10:49:25 UTC
c630d1f macsec: always read MACSEC_SA_ATTR_PN as a u64 Currently, MACSEC_SA_ATTR_PN is handled inconsistently, sometimes as a u32, sometimes forced into a u64 without checking the actual length of the attribute. Instead, we can use nla_get_u64 everywhere, which will read up to 64 bits into a u64, capped by the actual length of the attribute coming from userspace. This fixes several issues: - the check in validate_add_rxsa doesn't work with 32-bit attributes - the checks in validate_add_txsa and validate_upd_sa incorrectly reject X << 32 (with X != 0) Fixes: 48ef50fa866a ("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 10:49:25 UTC
b07a0e2 macsec: limit replay window size with XPN IEEE 802.1AEbw-2013 (section 10.7.8) specifies that the maximum value of the replay window is 2^30-1, to help with recovery of the upper bits of the PN. To avoid leaving the existing macsec device in an inconsistent state if this test fails during changelink, reuse the cleanup mechanism introduced for HW offload. This wasn't needed until now because macsec_changelink_common could not fail during changelink, as modifying the cipher suite was not allowed. Finally, this must happen after handling IFLA_MACSEC_CIPHER_SUITE so that secy->xpn is set. Fixes: 48ef50fa866a ("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 10:49:25 UTC
3240eac macsec: fix error message in macsec_add_rxsa and _txsa The expected length is MACSEC_SALT_LEN, not MACSEC_SA_ATTR_SALT. Fixes: 48ef50fa866a ("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 10:49:25 UTC
f46040e macsec: fix NULL deref in macsec_add_rxsa Commit 48ef50fa866a added a test on tb_sa[MACSEC_SA_ATTR_PN], but nothing guarantees that it's not NULL at this point. The same code was added to macsec_add_txsa, but there it's not a problem because validate_add_txsa checks that the MACSEC_SA_ATTR_PN attribute is present. Note: it's not possible to reproduce with iproute, because iproute doesn't allow creating an SA without specifying the PN. Fixes: 48ef50fa866a ("macsec: Netlink support of XPN cipher suites (IEEE 802.1AEbw)") Link: https://bugzilla.kernel.org/show_bug.cgi?id=208315 Reported-by: Frantisek Sumsal <fsumsal@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 10:49:25 UTC
1aaa62c s390/qeth: Fix typo 'the the' in comment Replace 'the the' with 'the' in the comment. Signed-off-by: Slark Xiao <slark_xiao@163.com> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 09:52:28 UTC
2540d3c net: ipa: Fix typo 'the the' in comment Replace 'the the' with 'the' in the comment. Signed-off-by: Slark Xiao <slark_xiao@163.com> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 09:52:28 UTC
af35f95 nfp: bpf: Fix typo 'the the' in comment Replace 'the the' with 'the' in the comment. Signed-off-by: Slark Xiao <slark_xiao@163.com> Acked-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> 25 July 2022, 09:52:28 UTC
aa709da Documentation: fix sctp_wmem in ip-sysctl.rst Since commit 1033990ac5b2 ("sctp: implement memory accounting on tx path"), SCTP has supported memory accounting on tx path where 'sctp_wmem' is used by sk_wmem_schedule(). So we should fix the description for this option in ip-sysctl.rst accordingly. v1->v2: - Improve the description as Marcelo suggested. Fixes: 1033990ac5b2 ("sctp: implement memory accounting on tx path") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> 24 July 2022, 20:41:58 UTC
f633672 net/tls: Remove the context from the list in tls_device_down tls_device_down takes a reference on all contexts it's going to move to the degraded state (software fallback). If sk_destruct runs afterwards, it can reduce the reference counter back to 1 and return early without destroying the context. Then tls_device_down will release the reference it took and call tls_device_free_ctx. However, the context will still stay in tls_device_down_list forever. The list will contain an item, memory for which is released, making a memory corruption possible. Fix the above bug by properly removing the context from all lists before any call to tls_device_free_ctx. Fixes: 3740651bf7e2 ("tls: Fix context leak on tls_device_down") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> 24 July 2022, 20:40:56 UTC
4d8f24e Revert "tcp: change pingpong threshold to 3" This reverts commit 4a41f453bedfd5e9cd040bad509d9da49feb3e2c. This to-be-reverted commit was meant to apply a stricter rule for the stack to enter pingpong mode. However, the condition used to check for interactive session "before(tp->lsndtime, icsk->icsk_ack.lrcvtime)" is jiffy based and might be too coarse, which delays the stack entering pingpong mode. We revert this patch so that we no longer use the above condition to determine interactive session, and also reduce pingpong threshold to 1. Fixes: 4a41f453bedf ("tcp: change pingpong threshold to 3") Reported-by: LemmyHuang <hlm3280@163.com> Suggested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220721204404.388396-1-weiwan@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 22 July 2022, 22:09:10 UTC
8ee18e2 caif: Fix bitmap data type in "struct caifsock" Bitmap are "unsigned long", so use it instead of a "u32" to make things more explicit. While at it, remove some useless cast (and leading spaces) when using the bitmap API. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:51:45 UTC
030f21b dt-bindings: net: fsl,fec: Add missing types to phy-reset-* properties The phy-reset-* properties are missing type definitions and are not common properties. Even though they are deprecated, a type is needed. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:38:16 UTC
17161c3 dt-bindings: net: ethernet-controller: Rework 'fixed-link' schema While the if/then schemas mostly work, there's a few issues. The 'allOf' schema will also be true if 'fixed-link' is not an array or object as a false 'if' schema (without an 'else') will be true. In the array case doesn't set the type (uint32-array) in the 'then' clause. In the node case, 'additionalProperties' is missing. Rework the schema to use oneOf with each possible type. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:38:16 UTC
b20a7ca Merge branch 'sysctl-races-part-5' Kuniyuki Iwashima says: ==================== sysctl: Fix data-races around ipv4_net_table (Round 5). This series fixes data-races around 15 knobs after tcp_dsack in ipv4_net_table. tcp_tso_win_divisor was skipped because it already uses READ_ONCE(). So, the final round for ipv4_net_table will start with tcp_pacing_ss_ratio. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:18 UTC
2afdbe7 tcp: Fix a data-race around sysctl_tcp_invalid_ratelimit. While reading sysctl_tcp_invalid_ratelimit, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 032ee4236954 ("tcp: helpers to mitigate ACK loops by rate-limiting out-of-window dupacks") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:18 UTC
85225e6 tcp: Fix a data-race around sysctl_tcp_autocorking. While reading sysctl_tcp_autocorking, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: f54b311142a9 ("tcp: auto corking") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:18 UTC
1330ffa tcp: Fix a data-race around sysctl_tcp_min_rtt_wlen. While reading sysctl_tcp_min_rtt_wlen, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: f672258391b4 ("tcp: track min RTT using windowed min-filter") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:18 UTC
2455e61 tcp: Fix a data-race around sysctl_tcp_tso_rtt_log. While reading sysctl_tcp_tso_rtt_log, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 65466904b015 ("tcp: adjust TSO packet sizes based on min_rtt") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:17 UTC
e0bb4ab tcp: Fix a data-race around sysctl_tcp_min_tso_segs. While reading sysctl_tcp_min_tso_segs, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 95bd09eb2750 ("tcp: TSO packets automatic sizing") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:17 UTC
db3815a tcp: Fix a data-race around sysctl_tcp_challenge_ack_limit. While reading sysctl_tcp_challenge_ack_limit, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 282f23c6ee34 ("tcp: implement RFC 5961 3.2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:17 UTC
9fb9019 tcp: Fix a data-race around sysctl_tcp_limit_output_bytes. While reading sysctl_tcp_limit_output_bytes, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 46d3ceabd8d9 ("tcp: TCP Small Queues") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:17 UTC
0f1e4d0 tcp: Fix data-races around sysctl_tcp_workaround_signed_windows. While reading sysctl_tcp_workaround_signed_windows, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 15d99e02baba ("[TCP]: sysctl to allow TCP window > 32767 sans wscale") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:17 UTC
7804764 tcp: Fix data-races around sysctl_tcp_moderate_rcvbuf. While reading sysctl_tcp_moderate_rcvbuf, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:17 UTC
ab1ba21 tcp: Fix data-races around sysctl_tcp_no_ssthresh_metrics_save. While reading sysctl_tcp_no_ssthresh_metrics_save, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 65e6d90168f3 ("net-tcp: Disable TCP ssthresh metrics cache by default") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:17 UTC
8499a24 tcp: Fix a data-race around sysctl_tcp_nometrics_save. While reading sysctl_tcp_nometrics_save, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:17 UTC
706c620 tcp: Fix a data-race around sysctl_tcp_frto. While reading sysctl_tcp_frto, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:17 UTC
36eeee7 tcp: Fix a data-race around sysctl_tcp_adv_win_scale. While reading sysctl_tcp_adv_win_scale, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:17 UTC
02ca527 tcp: Fix a data-race around sysctl_tcp_app_win. While reading sysctl_tcp_app_win, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:17 UTC
58ebb1c tcp: Fix data-races around sysctl_tcp_dsack. While reading sysctl_tcp_dsack, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 22 July 2022, 11:06:17 UTC
ebbbe23 net: sungem_phy: Add of_node_put() for reference returned by of_get_parent() In bcm5421_init(), we should call of_node_put() for the reference returned by of_get_parent() which has increased the refcount. Fixes: 3c326fe9cb7a ("[PATCH] ppc64: Add new PHY to sungem") Signed-off-by: Liang He <windhl@126.com> Link: https://lore.kernel.org/r/20220720131003.1287426-1-windhl@126.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 22 July 2022, 02:04:19 UTC
27161db net: pcs: xpcs: propagate xpcs_read error to xpcs_get_state_c37_sgmii While phylink_pcs_ops :: pcs_get_state does return void, xpcs_get_state() does check for a non-zero return code from xpcs_get_state_c37_sgmii() and prints that as a message to the kernel log. However, a non-zero return code from xpcs_read() is translated into "return false" (i.e. zero as int) and the I/O error is therefore not printed. Fix that. Fixes: b97b5331b8ab ("net: pcs: add C37 SGMII AN support for intel mGbE controller") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20220720112057.3504398-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 22 July 2022, 02:01:16 UTC
7ca433d Merge tag 'net-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from can. Still no major regressions, most of the changes are still due to data races fixes, plus the usual bunch of drivers fixes. Previous releases - regressions: - tcp/udp: make early_demux back namespacified. - dsa: fix issues with vlan_filtering_is_global Previous releases - always broken: - ip: fix data-races around ipv4_net_table (round 2, 3 & 4) - amt: fix validation and synchronization bugs - can: fix detection of mcp251863 - eth: iavf: fix handling of dummy receive descriptors - eth: lan966x: fix issues with MAC table - eth: stmmac: dwmac-mediatek: fix clock issue Misc: - dsa: update documentation" * tag 'net-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits) mlxsw: spectrum_router: Fix IPv4 nexthop gateway indication net/sched: cls_api: Fix flow action initialization tcp: Fix data-races around sysctl_tcp_max_reordering. tcp: Fix a data-race around sysctl_tcp_abort_on_overflow. tcp: Fix a data-race around sysctl_tcp_rfc1337. tcp: Fix a data-race around sysctl_tcp_stdurg. tcp: Fix a data-race around sysctl_tcp_retrans_collapse. tcp: Fix data-races around sysctl_tcp_slow_start_after_idle. tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts. tcp: Fix data-races around sysctl_tcp_recovery. tcp: Fix a data-race around sysctl_tcp_early_retrans. tcp: Fix data-races around sysctl knobs related to SYN option. udp: Fix a data-race around sysctl_udp_l3mdev_accept. ip: Fix data-races around sysctl_ip_prot_sock. ipv4: Fix data-races around sysctl_fib_multipath_hash_fields. ipv4: Fix data-races around sysctl_fib_multipath_hash_policy. ipv4: Fix a data-race around sysctl_fib_multipath_use_neigh. can: rcar_canfd: Add missing of_node_put() in rcar_canfd_probe() can: mcp251xfd: fix detection of mcp251863 Documentation: fix udp_wmem_min in ip-sysctl.rst ... 21 July 2022, 18:08:35 UTC
b67fbeb mmu_gather: Force tlb-flush VM_PFNMAP vmas Jann reported a race between munmap() and unmap_mapping_range(), where unmap_mapping_range() will no-op once unmap_vmas() has unlinked the VMA; however munmap() will not yet have invalidated the TLBs. Therefore unmap_mapping_range() will complete while there are still (stale) TLB entries for the specified range. Mitigate this by force flushing TLBs for VM_PFNMAP ranges. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 21 July 2022, 17:50:13 UTC
18ba064 mmu_gather: Let there be one tlb_{start,end}_vma() implementation Now that architectures are no longer allowed to override tlb_{start,end}_vma() re-arrange code so that there is only one implementation for each of these functions. This much simplifies trying to figure out what they actually do. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 21 July 2022, 17:50:13 UTC
1d7708e csky/tlb: Remove tlb_flush() define The previous patch removed the tlb_flush_end() implementation which used tlb_flush_range(). This means: - csky did double invalidates, a range invalidate per vma and a full invalidate at the end - csky actually has range invalidates and as such the generic tlb_flush implementation is more efficient for it. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Tested-by: Guo Ren <guoren@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 21 July 2022, 17:50:13 UTC
1e9fdf2 mmu_gather: Remove per arch tlb_{start,end}_vma() Scattered across the archs are 3 basic forms of tlb_{start,end}_vma(). Provide two new MMU_GATHER_knobs to enumerate them and remove the per arch tlb_{start,end}_vma() implementations. - MMU_GATHER_NO_FLUSH_CACHE indicates the arch has flush_cache_range() but does *NOT* want to call it for each VMA. - MMU_GATHER_MERGE_VMAS indicates the arch wants to merge the invalidate across multiple VMAs if possible. With these it is possible to capture the three forms: 1) empty stubs; select MMU_GATHER_NO_FLUSH_CACHE and MMU_GATHER_MERGE_VMAS 2) start: flush_cache_range(), end: empty; select MMU_GATHER_MERGE_VMAS 3) start: flush_cache_range(), end: flush_tlb_range(); default Obviously, if the architecture does not have flush_cache_range() then it also doesn't need to select MMU_GATHER_NO_FLUSH_CACHE. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 21 July 2022, 17:50:13 UTC
23a6761 scripts/gdb: Fix gdb 'lx-symbols' command Currently the command 'lx-symbols' in gdb exits with the error`Function "do_init_module" not defined in "kernel/module.c"`. This occurs because the file kernel/module.c was moved to kernel/module/main.c. Fix this breakage by changing the path to "kernel/module/main.c" in LoadModuleBreakpoint. Signed-off-by: Khalid Masum <khalid.masum.92@gmail.com> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Fixes: cfc1d277891e ("module: Move all into module/") Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 21 July 2022, 17:40:55 UTC
44e29e6 watch-queue: remove spurious double semicolon Sedat Dilek noticed that I had an extraneous semicolon at the end of a line in the previous patch. It's harmless, but unintentional, and while compilers just treat it as an extra empty statement, for all I know some other tooling might warn about it. So clean it up before other people notice too ;) Fixes: 353f7988dd84 ("watchqueue: make sure to serialize 'wqueue->defunct' properly") Reported-by: Sedat Dilek <sedat.dilek@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Sedat Dilek <sedat.dilek@gmail.com> 21 July 2022, 17:30:14 UTC
353f798 watchqueue: make sure to serialize 'wqueue->defunct' properly When the pipe is closed, we mark the associated watchqueue defunct by calling watch_queue_clear(). However, while that is protected by the watchqueue lock, new watchqueue entries aren't actually added under that lock at all: they use the pipe->rd_wait.lock instead, and looking up that pipe happens without any locking. The watchqueue code uses the RCU read-side section to make sure that the wqueue entry itself hasn't disappeared, but that does not protect the pipe_info in any way. So make sure to actually hold the wqueue lock when posting watch events, properly serializing against the pipe being torn down. Reported-by: Noam Rathaus <noamr@ssd-disclosure.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 20 July 2022, 17:46:07 UTC
543ce63 lockdown: Fix kexec lockdown bypass with ima policy The lockdown LSM is primarily used in conjunction with UEFI Secure Boot. This LSM may also be used on machines without UEFI. It can also be enabled when UEFI Secure Boot is disabled. One of lockdown's features is to prevent kexec from loading untrusted kernels. Lockdown can be enabled through a bootparam or after the kernel has booted through securityfs. If IMA appraisal is used with the "ima_appraise=log" boot param, lockdown can be defeated with kexec on any machine when Secure Boot is disabled or unavailable. IMA prevents setting "ima_appraise=log" from the boot param when Secure Boot is enabled, but this does not cover cases where lockdown is used without Secure Boot. To defeat lockdown, boot without Secure Boot and add ima_appraise=log to the kernel command line; then: $ echo "integrity" > /sys/kernel/security/lockdown $ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" > \ /sys/kernel/security/ima/policy $ kexec -ls unsigned-kernel Add a call to verify ima appraisal is set to "enforce" whenever lockdown is enabled. This fixes CVE-2022-21505. Cc: stable@vger.kernel.org Fixes: 29d3c1c8dfe7 ("kexec: Allow kexec_file() with appropriate IMA policy when locked down") Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com> Acked-by: Mimi Zohar <zohar@linux.ibm.com> Reviewed-by: John Haxby <john.haxby@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> 20 July 2022, 16:56:48 UTC
44484fa Merge tag 'linux-can-fixes-for-5.19-20220720' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== this is a pull request of 2 patches for net/master. The first patch is by me and fixes the detection of the mcp251863 in the mcp251xfd driver. The last patch is by Liang He and adds a missing of_node_put() in the rcar_canfd driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 10:13:54 UTC
e5ec6a2 mlxsw: spectrum_router: Fix IPv4 nexthop gateway indication mlxsw needs to distinguish nexthops with a gateway from connected nexthops in order to write the former to the adjacency table of the device. The check used to rely on the fact that nexthops with a gateway have a 'link' scope whereas connected nexthops have a 'host' scope. This is no longer correct after commit 747c14307214 ("ip: fix dflt addr selection for connected nexthop"). Fix that by instead checking the address family of the gateway IP. This is a more direct way and also consistent with the IPv6 counterpart in mlxsw_sp_rt6_is_gateway(). Cc: stable@vger.kernel.org Fixes: 747c14307214 ("ip: fix dflt addr selection for connected nexthop") Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 10:00:46 UTC
c0f47c2 net/sched: cls_api: Fix flow action initialization The cited commit refactored the flow action initialization sequence to use an interface method when translating tc action instances to flow offload objects. The refactored version skips the initialization of the generic flow action attributes for tc actions, such as pedit, that allocate more than one offload entry. This can cause potential issues for drivers mapping flow action ids. Populate the generic flow action fields for all the flow action entries. Fixes: c54e1d920f04 ("flow_offload: add ops to tc_action_ops for flow action setup") Signed-off-by: Oz Shlomo <ozsh@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> ---- v1 -> v2: - coalese the generic flow action fields initialization to a single loop Reviewed-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:54:27 UTC
3b15b3e Merge branch 'net-sysctl-races-round-4' Kuniyuki Iwashima says: ==================== sysctl: Fix data-races around ipv4_net_table (Round 4). This series fixes data-races around 17 knobs after fib_multipath_use_neigh in ipv4_net_table. tcp_fack was skipped because it's obsolete and there's no readers. So, round 5 will start with tcp_dsack, 2 rounds left for 27 knobs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:50 UTC
a11e5b3 tcp: Fix data-races around sysctl_tcp_max_reordering. While reading sysctl_tcp_max_reordering, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: dca145ffaa8d ("tcp: allow for bigger reordering level") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:50 UTC
2d17d9c tcp: Fix a data-race around sysctl_tcp_abort_on_overflow. While reading sysctl_tcp_abort_on_overflow, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:50 UTC
0b484c9 tcp: Fix a data-race around sysctl_tcp_rfc1337. While reading sysctl_tcp_rfc1337, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:50 UTC
4e08ed4 tcp: Fix a data-race around sysctl_tcp_stdurg. While reading sysctl_tcp_stdurg, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:50 UTC
1a63cb9 tcp: Fix a data-race around sysctl_tcp_retrans_collapse. While reading sysctl_tcp_retrans_collapse, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:50 UTC
4845b57 tcp: Fix data-races around sysctl_tcp_slow_start_after_idle. While reading sysctl_tcp_slow_start_after_idle, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 35089bb203f4 ("[TCP]: Add tcp_slow_start_after_idle sysctl.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:50 UTC
7c6f2a8 tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts. While reading sysctl_tcp_thin_linear_timeouts, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 36e31b0af587 ("net: TCP thin linear timeouts") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:50 UTC
e7d2ef8 tcp: Fix data-races around sysctl_tcp_recovery. While reading sysctl_tcp_recovery, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 4f41b1c58a32 ("tcp: use RACK to detect losses") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:50 UTC
52e6586 tcp: Fix a data-race around sysctl_tcp_early_retrans. While reading sysctl_tcp_early_retrans, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: eed530b6c676 ("tcp: early retransmit") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:49 UTC
3666f66 tcp: Fix data-races around sysctl knobs related to SYN option. While reading these knobs, they can be changed concurrently. Thus, we need to add READ_ONCE() to their readers. - tcp_sack - tcp_window_scaling - tcp_timestamps Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:49 UTC
3d72bb4 udp: Fix a data-race around sysctl_udp_l3mdev_accept. While reading sysctl_udp_l3mdev_accept, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 63a6fff353d0 ("net: Avoid receiving packets with an l3mdev on unbound UDP sockets") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:49 UTC
9b55c20 ip: Fix data-races around sysctl_ip_prot_sock. sysctl_ip_prot_sock is accessed concurrently, and there is always a chance of data-race. So, all readers and writers need some basic protection to avoid load/store-tearing. Fixes: 4548b683b781 ("Introduce a sysctl that modifies the value of PROT_SOCK.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:49 UTC
8895a9c ipv4: Fix data-races around sysctl_fib_multipath_hash_fields. While reading sysctl_fib_multipath_hash_fields, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: ce5c9c20d364 ("ipv4: Add a sysctl to control multipath hash fields") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:49 UTC
7998c12 ipv4: Fix data-races around sysctl_fib_multipath_hash_policy. While reading sysctl_fib_multipath_hash_policy, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: bf4e0a3db97e ("net: ipv4: add support for ECMP hash policy choice") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:49 UTC
87507bc ipv4: Fix a data-race around sysctl_fib_multipath_use_neigh. While reading sysctl_fib_multipath_use_neigh, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: a6db4494d218 ("net: ipv4: Consider failed nexthops in multipath routes") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:14:49 UTC
ef56217 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2022-07-20 1) Fix a policy refcount imbalance in xfrm_bundle_lookup. From Hangyu Hua. 2) Fix some clang -Wformat warnings. Justin Stitt ==================== Signed-off-by: David S. Miller <davem@davemloft.net> 20 July 2022, 09:11:58 UTC
7b66dfc can: rcar_canfd: Add missing of_node_put() in rcar_canfd_probe() We should use of_node_put() for the reference returned by of_get_child_by_name() which has increased the refcount. Fixes: 45721c406dcf ("can: rcar_canfd: Add support for r8a779a0 SoC") Link: https://lore.kernel.org/all/20220712095623.364287-1-windhl@126.com Signed-off-by: Liang He <windhl@126.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> 20 July 2022, 08:20:19 UTC
db87c00 can: mcp251xfd: fix detection of mcp251863 In commit c6f2a617a0a8 ("can: mcp251xfd: add support for mcp251863") support for the mcp251863 was added. However it was not taken into account that the auto detection of the chip model cannot distinguish between mcp2518fd and mcp251863 and would lead to a warning message if the firmware specifies a mcp251863. Fix auto detection: If a mcp2518fd compatible chip is found, keep the mcp251863 if specified by firmware, use mcp2518fd instead. Link: https://lore.kernel.org/all/20220706064835.1848864-1-mkl@pengutronix.de Fixes: c6f2a617a0a8 ("can: mcp251xfd: add support for mcp251863") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> 20 July 2022, 08:20:19 UTC
48ea8ea Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-07-18 This series contains updates to iavf driver only. Przemyslaw fixes handling of multiple VLAN requests to account for individual errors instead of rejecting them all. He removes incorrect implementations of ETHTOOL_COALESCE_MAX_FRAMES and ETHTOOL_COALESCE_MAX_FRAMES_IRQ. He also corrects an issue with NULL pointer caused by improper handling of dummy receive descriptors. Finally, he corrects debug prints reporting an unknown state. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: iavf: Fix missing state logs iavf: Fix handling of dummy receive descriptors iavf: Disallow changing rx/tx-frames and rx/tx-frames-irq iavf: Fix VLAN_V2 addition/rejection ==================== Link: https://lore.kernel.org/r/20220718174807.4113582-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 20 July 2022, 00:43:02 UTC
c6b10de Documentation: fix udp_wmem_min in ip-sysctl.rst UDP doesn't support tx memory accounting, and sysctl udp_wmem_min is not really used anywhere. So we should fix the description in ip-sysctl.rst accordingly. Fixes: 95766fff6b9a ("[UDP]: Add memory accounting.") Signed-off-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/c880a963d9b1fb5f442ae3c9e4dfa70d45296a16.1658167019.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 20 July 2022, 00:34:53 UTC
53eb9b0 net: ethernet: mtk_ppe: fix possible NULL pointer dereference in mtk_flow_get_wdma_info odev pointer can be NULL in mtk_flow_offload_replace routine according to the flower action rules. Fix possible NULL pointer dereference in mtk_flow_get_wdma_info. Fixes: a333215e10cb5 ("net: ethernet: mtk_eth_soc: implement flow offloading to WED devices") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/4e1685bc4976e21e364055f6bee86261f8f9ee93.1658137753.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> 20 July 2022, 00:27:18 UTC
cdf0b86 r8152: fix a WOL issue This fixes that the platform is waked by an unexpected packet. The size and range of FIFO is different when the device enters S3 state, so it is necessary to correct some settings when suspending. Regardless of jumbo frame, set RMS to 1522 and MTPS to MTPS_DEFAULT. Besides, enable MCU_BORW_EN to update the method of calculating the pointer of data. Then, the hardware could get the correct data. Fixes: 195aae321c82 ("r8152: support new chips") Signed-off-by: Hayes Wang <hayeswang@realtek.com> Link: https://lore.kernel.org/r/20220718082120.10957-391-nic_swsd@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> 20 July 2022, 00:10:56 UTC
b3fcfc4 Merge branch 'amt-fix-validation-and-synchronization-bugs' Taehee Yoo says: ==================== amt: fix validation and synchronization bugs There are some synchronization issues in the amt module. Especially, an amt gateway doesn't well synchronize its own variables and status(amt->status). It tries to use a workqueue for handles in a single thread. A global lock is also good, but it would occur complex locking complex. In this patchset, only the gateway uses workqueue. The reason why only gateway interface uses workqueue is that gateway should manage its own states and variables a little bit statefully. But relay doesn't need to manage tunnels statefully, stateless is okay. So, relay side message handlers are okay to be called concurrently. But it doesn't mean that no lock is needed. Only amt multicast data message type will not be processed by the work queue because It contains actual multicast data. So, it should be processed immediately. When any amt gateway events are triggered(sending discovery message by delayed_work, sending request message by delayed_work and receiving messages), it stores event and skb into the event queue(amt->events[16]). Then, workqueue processes these events one by one. The first patch is to use the work queue. The second patch is to remove unnecessary lock due to a previous patch. The third patch is to use READ_ONCE() in the amt module. Even if the amt module uses a single thread, some variables (ready4, ready6, amt->status) can be accessed concurrently. The fourth patch is to add missing nonce generation logic when it sends a new request message. The fifth patch is to drop unexpected advertisement messages. advertisement message should be received only after the gateway sends a discovery message first. So, the gateway should drop advertisement messages if it has never sent a discovery message and it also should drop duplicate advertisement messages. Using nonce is good to distinguish whether a received message is an expected message or not. The sixth patch is to drop unexpected query messages. This is the same behavior as the fourth patch. Query messages should be received only after the gateway sends a request message first. The nonce variable is used to distinguish whether it is a reply to a previous request message or not. amt->ready4 and amt->ready6 are used to distinguish duplicate messages. The seventh patch is to drop unexpected multicast data. AMT gateway should not receive multicast data message type before establish between gateway and relay. In order to drop unexpected multicast data messages, it checks amt->status. The last patch is to fix a locking problem on the relay side. amt->nr_tunnels variable is protected by amt->lock. But amt_request_handler() doesn't protect this variable. v2: - Use local_bh_disable() instead of rcu_read_lock_bh() in amt_membership_query_handler. - Fix using uninitialized variables. - Fix unexpectedly start the event_wq after stopping. - Fix possible deadlock in amt_event_work(). - Add a limit variable in amt_event_work() to prevent infinite working. - Rename amt_queue_events() to amt_queue_event(). ==================== Link: https://lore.kernel.org/r/20220717160910.19156-1-ap420073@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> 19 July 2022, 10:37:05 UTC
9899184 amt: do not use amt->nr_tunnels outside of lock amt->nr_tunnels is protected by amt->lock. But, amt_request_handler() has been using this variable without the amt->lock. So, it expands context of amt->lock in the amt_request_handler() to protect amt->nr_tunnels variable. Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> 19 July 2022, 10:37:02 UTC
e882827 amt: drop unexpected multicast data AMT gateway interface should not receive unexpected multicast data. Multicast data message type should be received after sending an update message, which means all establishment between gateway and relay is finished. So, amt_multicast_data_handler() checks amt->status. Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> 19 July 2022, 10:37:02 UTC
back to top