https://github.com/torvalds/linux
Revision 2b86cb8299765688c5119fd18d5f436716c81010 authored by Vladimir Oltean on 27 May 2020, 18:08:05 UTC, committed by David S. Miller on 27 May 2020, 22:09:28 UTC
Be there a platform with the following layout:

      Regular NIC
       |
       +----> DSA master for switch port
               |
               +----> DSA master for another switch port

After changing DSA back to static lockdep class keys in commit
1a33e10e4a95 ("net: partially revert dynamic lockdep key changes"), this
kernel splat can be seen:

[   13.361198] ============================================
[   13.366524] WARNING: possible recursive locking detected
[   13.371851] 5.7.0-rc4-02121-gc32a05ecd7af-dirty #988 Not tainted
[   13.377874] --------------------------------------------
[   13.383201] swapper/0/0 is trying to acquire lock:
[   13.388004] ffff0000668ff298 (&dsa_slave_netdev_xmit_lock_key){+.-.}-{2:2}, at: __dev_queue_xmit+0x84c/0xbe0
[   13.397879]
[   13.397879] but task is already holding lock:
[   13.403727] ffff0000661a1698 (&dsa_slave_netdev_xmit_lock_key){+.-.}-{2:2}, at: __dev_queue_xmit+0x84c/0xbe0
[   13.413593]
[   13.413593] other info that might help us debug this:
[   13.420140]  Possible unsafe locking scenario:
[   13.420140]
[   13.426075]        CPU0
[   13.428523]        ----
[   13.430969]   lock(&dsa_slave_netdev_xmit_lock_key);
[   13.435946]   lock(&dsa_slave_netdev_xmit_lock_key);
[   13.440924]
[   13.440924]  *** DEADLOCK ***
[   13.440924]
[   13.446860]  May be due to missing lock nesting notation
[   13.446860]
[   13.453668] 6 locks held by swapper/0/0:
[   13.457598]  #0: ffff800010003de0 ((&idev->mc_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0x0/0x400
[   13.466593]  #1: ffffd4d3fb478700 (rcu_read_lock){....}-{1:2}, at: mld_sendpack+0x0/0x560
[   13.474803]  #2: ffffd4d3fb478728 (rcu_read_lock_bh){....}-{1:2}, at: ip6_finish_output2+0x64/0xb10
[   13.483886]  #3: ffffd4d3fb478728 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x6c/0xbe0
[   13.492793]  #4: ffff0000661a1698 (&dsa_slave_netdev_xmit_lock_key){+.-.}-{2:2}, at: __dev_queue_xmit+0x84c/0xbe0
[   13.503094]  #5: ffffd4d3fb478728 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x6c/0xbe0
[   13.512000]
[   13.512000] stack backtrace:
[   13.516369] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-rc4-02121-gc32a05ecd7af-dirty #988
[   13.530421] Call trace:
[   13.532871]  dump_backtrace+0x0/0x1d8
[   13.536539]  show_stack+0x24/0x30
[   13.539862]  dump_stack+0xe8/0x150
[   13.543271]  __lock_acquire+0x1030/0x1678
[   13.547290]  lock_acquire+0xf8/0x458
[   13.550873]  _raw_spin_lock+0x44/0x58
[   13.554543]  __dev_queue_xmit+0x84c/0xbe0
[   13.558562]  dev_queue_xmit+0x24/0x30
[   13.562232]  dsa_slave_xmit+0xe0/0x128
[   13.565988]  dev_hard_start_xmit+0xf4/0x448
[   13.570182]  __dev_queue_xmit+0x808/0xbe0
[   13.574200]  dev_queue_xmit+0x24/0x30
[   13.577869]  neigh_resolve_output+0x15c/0x220
[   13.582237]  ip6_finish_output2+0x244/0xb10
[   13.586430]  __ip6_finish_output+0x1dc/0x298
[   13.590709]  ip6_output+0x84/0x358
[   13.594116]  mld_sendpack+0x2bc/0x560
[   13.597786]  mld_ifc_timer_expire+0x210/0x390
[   13.602153]  call_timer_fn+0xcc/0x400
[   13.605822]  run_timer_softirq+0x588/0x6e0
[   13.609927]  __do_softirq+0x118/0x590
[   13.613597]  irq_exit+0x13c/0x148
[   13.616918]  __handle_domain_irq+0x6c/0xc0
[   13.621023]  gic_handle_irq+0x6c/0x160
[   13.624779]  el1_irq+0xbc/0x180
[   13.627927]  cpuidle_enter_state+0xb4/0x4d0
[   13.632120]  cpuidle_enter+0x3c/0x50
[   13.635703]  call_cpuidle+0x44/0x78
[   13.639199]  do_idle+0x228/0x2c8
[   13.642433]  cpu_startup_entry+0x2c/0x48
[   13.646363]  rest_init+0x1ac/0x280
[   13.649773]  arch_call_rest_init+0x14/0x1c
[   13.653878]  start_kernel+0x490/0x4bc

Lockdep keys themselves were added in commit ab92d68fc22f ("net: core:
add generic lockdep keys"), and it's very likely that this splat existed
since then, but I have no real way to check, since this stacked platform
wasn't supported by mainline back then.

>From Taehee's own words:

  This patch was considered that all stackable devices have LLTX flag.
  But the dsa doesn't have LLTX, so this splat happened.
  After this patch, dsa shares the same lockdep class key.
  On the nested dsa interface architecture, which you illustrated,
  the same lockdep class key will be used in __dev_queue_xmit() because
  dsa doesn't have LLTX.
  So that lockdep detects deadlock because the same lockdep class key is
  used recursively although actually the different locks are used.
  There are some ways to fix this problem.

  1. using NETIF_F_LLTX flag.
  If possible, using the LLTX flag is a very clear way for it.
  But I'm so sorry I don't know whether the dsa could have LLTX or not.

  2. using dynamic lockdep again.
  It means that each interface uses a separate lockdep class key.
  So, lockdep will not detect recursive locking.
  But this way has a problem that it could consume lockdep class key
  too many.
  Currently, lockdep can have 8192 lockdep class keys.
   - you can see this number with the following command.
     cat /proc/lockdep_stats
     lock-classes:                         1251 [max: 8192]
     ...
     The [max: 8192] means that the maximum number of lockdep class keys.
  If too many lockdep class keys are registered, lockdep stops to work.
  So, using a dynamic(separated) lockdep class key should be considered
  carefully.
  In addition, updating lockdep class key routine might have to be existing.
  (lockdep_register_key(), lockdep_set_class(), lockdep_unregister_key())

  3. Using lockdep subclass.
  A lockdep class key could have 8 subclasses.
  The different subclass is considered different locks by lockdep
  infrastructure.
  But "lock-classes" is not counted by subclasses.
  So, it could avoid stopping lockdep infrastructure by an overflow of
  lockdep class keys.
  This approach should also have an updating lockdep class key routine.
  (lockdep_set_subclass())

  4. Using nonvalidate lockdep class key.
  The lockdep infrastructure supports nonvalidate lockdep class key type.
  It means this lockdep is not validated by lockdep infrastructure.
  So, the splat will not happen but lockdep couldn't detect real deadlock
  case because lockdep really doesn't validate it.
  I think this should be used for really special cases.
  (lockdep_set_novalidate_class())

Further discussion here:
https://patchwork.ozlabs.org/project/netdev/patch/20200503052220.4536-2-xiyou.wangcong@gmail.com/

There appears to be no negative side-effect to declaring lockless TX for
the DSA virtual interfaces, which means they handle their own locking.
So that's what we do to make the splat go away.

Patch tested in a wide variety of cases: unicast, multicast, PTP, etc.

Fixes: ab92d68fc22f ("net: core: add generic lockdep keys")
Suggested-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1 parent 183be6f
History
Tip revision: 2b86cb8299765688c5119fd18d5f436716c81010 authored by Vladimir Oltean on 27 May 2020, 18:08:05 UTC
net: dsa: declare lockless TX feature for slave ports
Tip revision: 2b86cb8
File Mode Size
842
crypto
dim
fonts
kunit
livepatch
lz4
lzo
math
mpi
raid6
reed_solomon
vdso
xz
zlib_deflate
zlib_dfltcc
zlib_inflate
zstd
.gitignore -rw-r--r-- 116 bytes
Kconfig -rw-r--r-- 14.9 KB
Kconfig.debug -rw-r--r-- 72.3 KB
Kconfig.kasan -rw-r--r-- 5.9 KB
Kconfig.kgdb -rw-r--r-- 4.1 KB
Kconfig.ubsan -rw-r--r-- 2.8 KB
Makefile -rw-r--r-- 10.1 KB
argv_split.c -rw-r--r-- 2.1 KB
ashldi3.c -rw-r--r-- 541 bytes
ashrdi3.c -rw-r--r-- 565 bytes
asn1_decoder.c -rw-r--r-- 13.2 KB
assoc_array.c -rw-r--r-- 51.8 KB
atomic64.c -rw-r--r-- 4.5 KB
atomic64_test.c -rw-r--r-- 6.4 KB
audit.c -rw-r--r-- 1.8 KB
bcd.c -rw-r--r-- 297 bytes
bch.c -rw-r--r-- 36.1 KB
bitmap.c -rw-r--r-- 37.9 KB
bitrev.c -rw-r--r-- 1.9 KB
bootconfig.c -rw-r--r-- 18.3 KB
bsearch.c -rw-r--r-- 1.4 KB
btree.c -rw-r--r-- 19.2 KB
bucket_locks.c -rw-r--r-- 1.4 KB
bug.c -rw-r--r-- 5.8 KB
build_OID_registry -rwxr-xr-x 4.5 KB
bust_spinlocks.c -rw-r--r-- 676 bytes
check_signature.c -rw-r--r-- 635 bytes
checksum.c -rw-r--r-- 4.8 KB
clz_ctz.c -rw-r--r-- 1.2 KB
clz_tab.c -rw-r--r-- 891 bytes
cmdline.c -rw-r--r-- 5.1 KB
cmpdi2.c -rw-r--r-- 501 bytes
compat_audit.c -rw-r--r-- 832 bytes
cpu_rmap.c -rw-r--r-- 7.6 KB
cpumask.c -rw-r--r-- 6.7 KB
crc-ccitt.c -rw-r--r-- 5.6 KB
crc-itu-t.c -rw-r--r-- 2.7 KB
crc-t10dif.c -rw-r--r-- 2.9 KB
crc16.c -rw-r--r-- 2.7 KB
crc32.c -rw-r--r-- 9.3 KB
crc32defs.h -rw-r--r-- 1.6 KB
crc32test.c -rw-r--r-- 37.5 KB
crc4.c -rw-r--r-- 1003 bytes
crc64.c -rw-r--r-- 1.8 KB
crc7.c -rw-r--r-- 2.5 KB
crc8.c -rw-r--r-- 2.4 KB
ctype.c -rw-r--r-- 1.4 KB
debug_info.c -rw-r--r-- 777 bytes
debug_locks.c -rw-r--r-- 1.2 KB
debugobjects.c -rw-r--r-- 34.8 KB
dec_and_lock.c -rw-r--r-- 1.2 KB
decompress.c -rw-r--r-- 1.7 KB
decompress_bunzip2.c -rw-r--r-- 23.5 KB
decompress_inflate.c -rw-r--r-- 4.8 KB
decompress_unlz4.c -rw-r--r-- 4.0 KB
decompress_unlzma.c -rw-r--r-- 15.8 KB
decompress_unlzo.c -rw-r--r-- 6.4 KB
decompress_unxz.c -rw-r--r-- 10.9 KB
devres.c -rw-r--r-- 11.7 KB
digsig.c -rw-r--r-- 5.5 KB
dump_stack.c -rw-r--r-- 3.2 KB
dynamic_debug.c -rw-r--r-- 26.7 KB
dynamic_queue_limits.c -rw-r--r-- 4.3 KB
earlycpio.c -rw-r--r-- 3.6 KB
errname.c -rw-r--r-- 3.6 KB
error-inject.c -rw-r--r-- 5.4 KB
errseq.c -rw-r--r-- 6.6 KB
extable.c -rw-r--r-- 3.0 KB
fault-inject.c -rw-r--r-- 5.8 KB
fdt.c -rw-r--r-- 69 bytes
fdt_addresses.c -rw-r--r-- 79 bytes
fdt_empty_tree.c -rw-r--r-- 80 bytes
fdt_ro.c -rw-r--r-- 72 bytes
fdt_rw.c -rw-r--r-- 72 bytes
fdt_strerror.c -rw-r--r-- 78 bytes
fdt_sw.c -rw-r--r-- 72 bytes
fdt_wip.c -rw-r--r-- 73 bytes
find_bit.c -rw-r--r-- 4.5 KB
find_bit_benchmark.c -rw-r--r-- 3.9 KB
flex_proportions.c -rw-r--r-- 6.9 KB
gen_crc32table.c -rw-r--r-- 3.3 KB
gen_crc64table.c -rw-r--r-- 1.4 KB
genalloc.c -rw-r--r-- 26.1 KB
generic-radix-tree.c -rw-r--r-- 5.3 KB
glob.c -rw-r--r-- 3.5 KB
globtest.c -rw-r--r-- 4.2 KB
hexdump.c -rw-r--r-- 7.3 KB
hweight.c -rw-r--r-- 1.9 KB
idr.c -rw-r--r-- 17.3 KB
inflate.c -rw-r--r-- 38.7 KB
interval_tree.c -rw-r--r-- 540 bytes
interval_tree_test.c -rw-r--r-- 3.4 KB
iomap.c -rw-r--r-- 9.1 KB
iomap_copy.c -rw-r--r-- 2.2 KB
iommu-helper.c -rw-r--r-- 755 bytes
ioremap.c -rw-r--r-- 6.1 KB
iov_iter.c -rw-r--r-- 42.0 KB
irq_poll.c -rw-r--r-- 5.4 KB
irq_regs.c -rw-r--r-- 394 bytes
is_single_threaded.c -rw-r--r-- 1.2 KB
kasprintf.c -rw-r--r-- 1.4 KB
kfifo.c -rw-r--r-- 12.1 KB
klist.c -rw-r--r-- 10.4 KB
kobject.c -rw-r--r-- 27.9 KB
kobject_uevent.c -rw-r--r-- 18.8 KB
kstrtox.c -rw-r--r-- 10.5 KB
kstrtox.h -rw-r--r-- 293 bytes
libcrc32c.c -rw-r--r-- 2.0 KB
list-test.c -rw-r--r-- 17.4 KB
list_debug.c -rw-r--r-- 1.8 KB
list_sort.c -rw-r--r-- 8.4 KB
llist.c -rw-r--r-- 2.5 KB
locking-selftest-hardirq.h -rw-r--r-- 246 bytes
locking-selftest-mutex.h -rw-r--r-- 159 bytes
locking-selftest-rlock-hardirq.h -rw-r--r-- 74 bytes
locking-selftest-rlock-softirq.h -rw-r--r-- 74 bytes
locking-selftest-rlock.h -rw-r--r-- 197 bytes
locking-selftest-rsem.h -rw-r--r-- 202 bytes
locking-selftest-rtmutex.h -rw-r--r-- 162 bytes
locking-selftest-softirq.h -rw-r--r-- 246 bytes
locking-selftest-spin-hardirq.h -rw-r--r-- 73 bytes
locking-selftest-spin-softirq.h -rw-r--r-- 73 bytes
locking-selftest-spin.h -rw-r--r-- 157 bytes
locking-selftest-wlock-hardirq.h -rw-r--r-- 74 bytes
locking-selftest-wlock-softirq.h -rw-r--r-- 74 bytes
locking-selftest-wlock.h -rw-r--r-- 197 bytes
locking-selftest-wsem.h -rw-r--r-- 202 bytes
locking-selftest.c -rw-r--r-- 43.7 KB
lockref.c -rw-r--r-- 4.5 KB
logic_pio.c -rw-r--r-- 8.4 KB
lru_cache.c -rw-r--r-- 18.8 KB
lshrdi3.c -rw-r--r-- 559 bytes
memcat_p.c -rw-r--r-- 753 bytes
memory-notifier-error-inject.c -rw-r--r-- 1.1 KB
memregion.c -rw-r--r-- 400 bytes
memweight.c -rw-r--r-- 1.0 KB
muldi3.c -rw-r--r-- 1.7 KB
net_utils.c -rw-r--r-- 640 bytes
netdev-notifier-error-inject.c -rw-r--r-- 1.5 KB
nlattr.c -rw-r--r-- 22.6 KB
nmi_backtrace.c -rw-r--r-- 3.0 KB
nodemask.c -rw-r--r-- 653 bytes
notifier-error-inject.c -rw-r--r-- 2.5 KB
notifier-error-inject.h -rw-r--r-- 653 bytes
objagg.c -rw-r--r-- 28.3 KB
of-reconfig-notifier-error-inject.c -rw-r--r-- 1.3 KB
oid_registry.c -rw-r--r-- 3.7 KB
once.c -rw-r--r-- 1.4 KB
packing.c -rw-r--r-- 6.5 KB
parman.c -rw-r--r-- 10.6 KB
parser.c -rw-r--r-- 8.1 KB
pci_iomap.c -rw-r--r-- 4.2 KB
percpu-refcount.c -rw-r--r-- 13.6 KB
percpu_counter.c -rw-r--r-- 5.8 KB
percpu_test.c -rw-r--r-- 3.2 KB
plist.c -rw-r--r-- 5.9 KB
pm-notifier-error-inject.c -rw-r--r-- 1.2 KB
radix-tree.c -rw-r--r-- 43.1 KB
random32.c -rw-r--r-- 12.8 KB
ratelimit.c -rw-r--r-- 1.6 KB
rbtree.c -rw-r--r-- 17.1 KB
rbtree_test.c -rw-r--r-- 9.4 KB
refcount.c -rw-r--r-- 4.8 KB
rhashtable.c -rw-r--r-- 29.4 KB
sbitmap.c -rw-r--r-- 16.3 KB
scatterlist.c -rw-r--r-- 25.2 KB
seq_buf.c -rw-r--r-- 9.9 KB
sg_pool.c -rw-r--r-- 4.2 KB
sg_split.c -rw-r--r-- 5.0 KB
sha1.c -rw-r--r-- 6.1 KB
show_mem.c -rw-r--r-- 1.1 KB
siphash.c -rw-r--r-- 12.0 KB
smp_processor_id.c -rw-r--r-- 1.4 KB
sort.c -rw-r--r-- 8.4 KB
stackdepot.c -rw-r--r-- 9.5 KB
stmp_device.c -rw-r--r-- 1.9 KB
string.c -rw-r--r-- 24.7 KB
string_helpers.c -rw-r--r-- 13.9 KB
strncpy_from_user.c -rw-r--r-- 3.2 KB
strnlen_user.c -rw-r--r-- 3.3 KB
syscall.c -rw-r--r-- 2.5 KB
test-kstrtox.c -rw-r--r-- 17.3 KB
test-string_helpers.c -rw-r--r-- 10.3 KB
test_bitfield.c -rw-r--r-- 4.3 KB
test_bitmap.c -rw-r--r-- 16.9 KB
test_blackhole_dev.c -rw-r--r-- 2.5 KB
test_bpf.c -rw-r--r-- 161.1 KB
test_debug_virtual.c -rw-r--r-- 981 bytes
test_firmware.c -rw-r--r-- 24.4 KB
test_hash.c -rw-r--r-- 6.3 KB
test_hexdump.c -rw-r--r-- 6.3 KB
test_ida.c -rw-r--r-- 4.3 KB
test_kasan.c -rw-r--r-- 17.2 KB
test_kmod.c -rw-r--r-- 29.9 KB
test_list_sort.c -rw-r--r-- 3.3 KB
test_lockup.c -rw-r--r-- 15.7 KB
test_memcat_p.c -rw-r--r-- 2.2 KB
test_meminit.c -rw-r--r-- 9.7 KB
test_min_heap.c -rw-r--r-- 4.3 KB
test_module.c -rw-r--r-- 794 bytes
test_objagg.c -rw-r--r-- 24.6 KB
test_overflow.c -rw-r--r-- 22.3 KB
test_parman.c -rw-r--r-- 11.2 KB
test_printf.c -rw-r--r-- 16.3 KB
test_rhashtable.c -rw-r--r-- 20.0 KB
test_siphash.c -rw-r--r-- 7.5 KB
test_sort.c -rw-r--r-- 870 bytes
test_stackinit.c -rw-r--r-- 11.3 KB
test_static_key_base.c -rw-r--r-- 1.6 KB
test_static_keys.c -rw-r--r-- 5.6 KB
test_string.c -rw-r--r-- 3.8 KB
test_strscpy.c -rw-r--r-- 4.0 KB
test_sysctl.c -rw-r--r-- 3.7 KB
test_ubsan.c -rw-r--r-- 2.4 KB
test_user_copy.c -rw-r--r-- 9.1 KB
test_uuid.c -rw-r--r-- 3.4 KB
test_vmalloc.c -rw-r--r-- 10.6 KB
test_xarray.c -rw-r--r-- 44.5 KB
textsearch.c -rw-r--r-- 9.3 KB
timerqueue.c -rw-r--r-- 2.5 KB
ts_bm.c -rw-r--r-- 5.1 KB
ts_fsm.c -rw-r--r-- 10.4 KB
ts_kmp.c -rw-r--r-- 4.1 KB
ubsan.c -rw-r--r-- 10.4 KB
ubsan.h -rw-r--r-- 1.6 KB
ucmpdi2.c -rw-r--r-- 568 bytes
ucs2_string.c -rw-r--r-- 2.5 KB
usercopy.c -rw-r--r-- 2.0 KB
uuid.c -rw-r--r-- 2.9 KB
vsprintf.c -rw-r--r-- 81.5 KB
win_minmax.c -rw-r--r-- 3.4 KB
xarray.c -rw-r--r-- 53.3 KB
xxhash.c -rw-r--r-- 12.7 KB

back to top