https://github.com/torvalds/linux
Revision 21b5944350052d2583e82dd59b19a9ba94a007f0 authored by Eric W. Biederman on 19 December 2017, 17:27:56 UTC, committed by David S. Miller on 20 December 2017, 17:42:22 UTC
(I can trivially verify that that idr_remove in cleanup_net happens
 after the network namespace count has dropped to zero --EWB)

Function get_net_ns_by_id() does not check for net::count
after it has found a peer in netns_ids idr.

It may dereference a peer after its count has already been
finally decremented. This leads to a double free and memory
corruption:

put_net(peer)                                   rtnl_lock()
atomic_dec_and_test(&peer->count) [count=0]     ...
__put_net(peer)                                 get_net_ns_by_id(net, id)
  spin_lock(&cleanup_list_lock)
  list_add(&net->cleanup_list, &cleanup_list)
  spin_unlock(&cleanup_list_lock)
queue_work()                                      peer = idr_find(&net->netns_ids, id)
  |                                               get_net(peer) [count=1]
  |                                               ...
  |                                               (use after final put)
  v                                               ...
  cleanup_net()                                   ...
    spin_lock(&cleanup_list_lock)                 ...
    list_replace_init(&cleanup_list, ..)          ...
    spin_unlock(&cleanup_list_lock)               ...
    ...                                           ...
    ...                                           put_net(peer)
    ...                                             atomic_dec_and_test(&peer->count) [count=0]
    ...                                               spin_lock(&cleanup_list_lock)
    ...                                               list_add(&net->cleanup_list, &cleanup_list)
    ...                                               spin_unlock(&cleanup_list_lock)
    ...                                             queue_work()
    ...                                           rtnl_unlock()
    rtnl_lock()                                   ...
    for_each_net(tmp) {                           ...
      id = __peernet2id(tmp, peer)                ...
      spin_lock_irq(&tmp->nsid_lock)              ...
      idr_remove(&tmp->netns_ids, id)             ...
      ...                                         ...
      net_drop_ns()                               ...
        net_free(peer)                            ...
    }                                             ...
  |
  v
  cleanup_net()
    ...
    (Second free of peer)

Also, put_net() on the right CPU may be reordered with
list_replace_init(&cleanup_list, ..) on the left CPU, and then
cleanup_list will be corrupted.

Since cleanup_net() is executed in a worker thread, while
put_net(peer) can happen anywhere, there should be
enough time for a concurrent get_net_ns_by_id() to pick
the peer up, so the race does not seem unlikely.
The patch fixes the problem in the standard way.

(Also, there is a possible problem in peernet2id_alloc(), which requires
a check of net::count under nsid_lock and maybe_get_net(peer), but
in the current stable kernel it is used under rtnl_lock(), so it has to
be safe. Openswitch has begun to use peernet2id_alloc(), and possibly it
should be fixed too. This is not in the stable kernel yet, so I'll send
a separate message to netdev@ later.)

Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Fixes: 0c7aecd4bde4 "netns: add rtnl cmd to add and get peer netns ids"
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
1 parent eda9873
net: Fix double free and memory corruption in get_net_ns_by_id()
File Mode Size
partitions
Kconfig -rw-r--r-- 6.2 KB
Kconfig.iosched -rw-r--r-- 2.7 KB
Makefile -rw-r--r-- 1.5 KB
badblocks.c -rw-r--r-- 14.5 KB
bfq-cgroup.c -rw-r--r-- 33.6 KB
bfq-iosched.c -rw-r--r-- 170.4 KB
bfq-iosched.h -rw-r--r-- 31.4 KB
bfq-wf2q.c -rw-r--r-- 52.0 KB
bio-integrity.c -rw-r--r-- 13.8 KB
bio.c -rw-r--r-- 49.5 KB
blk-cgroup.c -rw-r--r-- 37.8 KB
blk-core.c -rw-r--r-- 101.1 KB
blk-exec.c -rw-r--r-- 2.9 KB
blk-flush.c -rw-r--r-- 17.0 KB
blk-integrity.c -rw-r--r-- 12.1 KB
blk-ioc.c -rw-r--r-- 11.0 KB
blk-lib.c -rw-r--r-- 10.6 KB
blk-map.c -rw-r--r-- 5.8 KB
blk-merge.c -rw-r--r-- 19.7 KB
blk-mq-cpumap.c -rw-r--r-- 1.7 KB
blk-mq-debugfs.c -rw-r--r-- 23.5 KB
blk-mq-debugfs.h -rw-r--r-- 2.1 KB
blk-mq-pci.c -rw-r--r-- 1.6 KB
blk-mq-rdma.c -rw-r--r-- 1.7 KB
blk-mq-sched.c -rw-r--r-- 15.9 KB
blk-mq-sched.h -rw-r--r-- 2.8 KB
blk-mq-sysfs.c -rw-r--r-- 8.2 KB
blk-mq-tag.c -rw-r--r-- 11.3 KB
blk-mq-tag.h -rw-r--r-- 2.3 KB
blk-mq-virtio.c -rw-r--r-- 1.7 KB
blk-mq.c -rw-r--r-- 72.3 KB
blk-mq.h -rw-r--r-- 5.2 KB
blk-settings.c -rw-r--r-- 28.7 KB
blk-softirq.c -rw-r--r-- 4.3 KB
blk-stat.c -rw-r--r-- 4.8 KB
blk-stat.h -rw-r--r-- 5.5 KB
blk-sysfs.c -rw-r--r-- 23.4 KB
blk-tag.c -rw-r--r-- 10.0 KB
blk-throttle.c -rw-r--r-- 67.8 KB
blk-timeout.c -rw-r--r-- 5.8 KB
blk-wbt.c -rw-r--r-- 17.5 KB
blk-wbt.h -rw-r--r-- 4.0 KB
blk-zoned.c -rw-r--r-- 7.6 KB
blk.h -rw-r--r-- 9.9 KB
bounce.c -rw-r--r-- 6.5 KB
bsg-lib.c -rw-r--r-- 7.5 KB
bsg.c -rw-r--r-- 22.8 KB
cfq-iosched.c -rw-r--r-- 127.1 KB
cmdline-parser.c -rw-r--r-- 4.9 KB
compat_ioctl.c -rw-r--r-- 10.9 KB
deadline-iosched.c -rw-r--r-- 11.2 KB
elevator.c -rw-r--r-- 27.2 KB
genhd.c -rw-r--r-- 47.8 KB
ioctl.c -rw-r--r-- 15.2 KB
ioprio.c -rw-r--r-- 5.0 KB
kyber-iosched.c -rw-r--r-- 21.3 KB
mq-deadline.c -rw-r--r-- 16.8 KB
noop-iosched.c -rw-r--r-- 2.6 KB
opal_proto.h -rw-r--r-- 9.3 KB
partition-generic.c -rw-r--r-- 16.8 KB
scsi_ioctl.c -rw-r--r-- 19.9 KB
sed-opal.c -rw-r--r-- 58.5 KB
t10-pi.c -rw-r--r-- 4.9 KB
